


WACV 2025: Tucson, AZ, USA
- IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025, Tucson, AZ, USA, February 26 - March 6, 2025. IEEE 2025, ISBN 979-8-3315-1083-1
- Joanna Kaleta, Kacper Kania, Tomasz Trzcinski, Marek Kowalski:
LumiGauss: Relightable Gaussian Splatting in the Wild. 1-10 - Junjie Wang, Tomas Nordström:
Latency Robust Cooperative Perception Using Asynchronous Feature Fusion. 1-10 - Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, David Harwath:
Temporally Streaming Audio-Visual Synchronization for Real-World Videos. 1-9 - Seul-Ki Yeom, Julian von Klitzing:
U-MixFormer: UNet-Like Transformer with Mix-Attention for Efficient Semantic Segmentation. 1-10 - Seong Jong Yoo, Snehesh Shrestha, Irina Muresanu, Cornelia Fermüller:
VioPose: Violin Performance 4D Pose Estimation by Hierarchical Audiovisual Inference. 1-12 - Jane Wu, Diego Thomas, Ronald Fedkiw:
Sparse-View 3D Reconstruction of Clothed Humans via Normal Maps. 11-22 - Théo Morales, Omid Taheri, Gerard Lacey:
A Versatile and Differentiable Hand-Object Interaction Representation. 23-33 - Kohei Matsuzaki, Keisuke Nonaka:
Point Cloud Color Upsampling with Attention-Based Coarse Colorization and Refinement. 34-43 - Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, Jonathan Kelly:
FaVoR: Features via Voxel Rendering for Camera Relocalization. 44-53 - Pallabjyoti Deka, Saumik Bhattacharya, Debashis Sen, Prabir Kumar Biswas:
3D Shape Completion using Multi-resolution Spectral Encoding. 54-63 - Alexander H. Berger, Laurin Lux, Suprosanna Shit, Ivan Ezhov, Georgios Kaissis, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold:
Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers. 64-74 - Hossein Askari, Fred Roosta, Hongfu Sun:
Training-free Medical Image Inverses via Bi-level Guided Diffusion Models. 75-84 - Suhyun Ahn, Wonjung Park, Jihoon Cho, Jinah Park:
Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images. 85-95 - Trong-Thang Pham, Tien-Phat Nguyen, Yuki Ikebe, Akash Awasthi, Zhigang Deng, Carol C. Wu, Hien Nguyen, Ngan Le:
GazeSearch: Radiology Findings Search Benchmark. 96-106 - Yitong Li, Morteza Ghahremani, Youssef Wally, Christian Wachinger:
DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET. 107-116 - Mevan Ekanayake, Zhifeng Chen, Gary F. Egan, Mehrtash Harandi, Zhaolin Chen:
SeCo-INR: Semantically Conditioned Implicit Neural Representations for Improved Medical Image Super-Resolution. 117-126 - Majed El Helou, Doruk Cetin, Petar Stamenkovic, Niko Benjamin Huber, Fabio Zünd:
VerA: Versatile Anonymization Applicable to Clinical Facial Photographs. 127-138 - Dixi Yao:
Towards Privacy-Preserving Split Learning for ControlNet. 139-148 - Stefan Smeu, Elisabeta Oneata, Dan Oneata:
DeCLIP: Decoding CLIP Representations for Deepfake Localization. 149-159 - Maciej Chrabaszcz, Hubert Baniecki, Piotr Komorowski, Szymon Plotka, Przemyslaw Biecek:
Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models. 160-171 - Xin Hu, Janet Wang, Jihun Hamm, Rie Roselyne Yotsu, Zhengming Ding:
Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM. 172-181 - Hanxiao Tan:
Evaluating Sensitivity Consistency of Explanations. 182-191 - Pengxiao Wang, Tzu-Heng Lin, Chunyu Wang, Yizhou Wang:
Shift Equivariant Pose Network. 192-201 - Yunfei Li, Yuezun Li, Xin Wang, Baoyuan Wu, Jiaran Zhou, Junyu Dong:
Texture, Shape and Order Matter: A New Transformer Design for Sequential DeepFake Detection. 202-211 - Hai Wang, Jing-Hao Xue:
360PanT: Training-Free Text-Driven 360-Degree Panorama-to-Panorama Translation. 212-221 - Enis Simsar, Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari:
LIME: Localized Image Editing via Attention Regularization in Diffusion Models. 222-231 - Rohit Jena, Ali Taghibakhshi, Sahil Jain, Gerald Shen, Nima Tajbakhsh, Arash Vahdat:
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models. 232-242 - Qinpeng Cui, Xinyi Zhang, Qiqi Bao, Qingmin Liao:
Elucidating the Solution Space of Extended Reverse-Time SDE for Diffusion Models. 243-252 - Xiaofei Huang, Elaheh Hatamimajoumerd, Amal Mathew, Sarah Ostadabbas:
Infant Action Generative Modeling. 253-265 - Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori, Sahar Dastani, Milad Cheraghalikhani, David Osowiechi, Farzad Beizaee, Gustavo Adolfo Vargas Hakim, Ismail Ben Ayed, Christian Desrosiers:
Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging. 266-275 - Peizhi Yan, Rabab Ward, Qiang Tang, Shan Du:
Gaussian Déjà-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities. 276-286 - Kazuto Ichimaru, Diego Thomas, Takafumi Iwaguchi, Hiroshi Kawasaki:
Neural SDF for Shadow-Aware Unsupervised Structured Light. 287-296 - Mateusz Poleski, Jacek Tabor, Przemyslaw Spurek:
GeoGuide: Geometric Guidance of Diffusion Models. 297-305 - Simon Boeder, Benjamin Risse:
OccFlowNet: Occupancy Estimation via Differentiable Rendering and Occupancy Flow. 306-316 - Boyuan Zhang, Zhenliang He, Meina Kan, Shiguang Shan:
Precise Integral in NeRFs: Overcoming the Approximation Errors of Numerical Quadrature. 317-326 - Cagla Deniz Bahadir, Gozde Bozdagi Akar, Mert R. Sabuncu:
LLM-Generated Rewrite and Context Modulation for Enhanced Vision Language Models in Digital Pathology. 327-336 - Yiying Wang, Abhirup Banerjee, Robin P. Choudhury, Vicente Grau:
DeepCA: Deep Learning-Based 3D Coronary Artery Tree Reconstruction from Two 2D Non-Simultaneous X-Ray Angiography Projections. 337-346 - Daniel Kim, Mohammed A. Al-masni, Jaehun Lee, Dong-Hyun Kim, Kanghyun Ryu:
Improving Pelvic MR-CT Image Alignment with Self-Supervised Reference-Augmented Pseudo-CT Generation Framework. 347-356 - Felix Wagner, Wentian Xu, Pramit Saha, Ziyun Liang, Daniel Whitehouse, David K. Menon, Virginia F. J. Newcombe, Natalie Voets, J. Alison Noble, Konstantinos Kamnitsas:
Feasibility of Federated Learning from Client Databases with Different Brain Diseases and MRI Modalities. 357-367 - Shumpei Takezaki, Kiyohito Tanaka, Seiichi Uchida:
Self-Relaxed Joint Training: Sample Selection for Severity Estimation with Ordinal Noisy Labels. 368-377 - Tiancheng Gu, Kaicheng Yang, Xiang An, Ziyong Feng, Dongnan Liu, Tom Weidong Cai:
ORID: Organ-Regional Information Driven Framework for Radiology Report Generation. 378-387 - Snehashis Majhi, Mohammed Guermal, Antitza Dantcheva, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, François Brémond:
Guess Future Anomalies from Normalcy: Forecasting Abnormal Behavior in Real-World Videos. 388-398 - Seoyeon Gye, Junwon Ko, Hyounguk Shon, Minchan Kwon, Junmo Kim:
Reducing the Content Bias for AI-generated Image Detection. 399-408 - Jaehyeong Park, Juncheol Ye, Seungkook Lee, Hyun W. Ka, Dongsu Han:
NarrAD: Automatic Generation of Audio Descriptions for Movies with Rich Narrative Context. 409-419 - Tung Luu, Nam Le, Duc Le, Bac Le:
From Visual Explanations to Counterfactual Explanations with Latent Diffusion. 420-429 - Yu-Yun Tseng, Tanusree Sharma, Lotus Zhang, Abigale Stangl, Leah Findlater, Yang Wang, Danna Gurari:
BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments. 430-440 - Gayoon Choi, Taejin Jeong, Sujung Hong, Seong Jae Hwang:
DragText: Rethinking Text Embedding in Point-Based Image Editing. 441-450 - Brian B. Moser, Stanislav Frolov, Federico Raue, Sebastian Palacio, Andreas Dengel:
Dynamic Attention-Guided Diffusion for Image Super-Resolution. 451-460 - Shuang Chen, Haozheng Zhang, Amir Atapour-Abarghouei, Hubert P. H. Shum:
SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM. 461-471 - Rahul Sajnani, Jeroen van Baar, Jie Min, Kapil Katyal, Srinath Sridhar:
GeoDiffuser: Geometry-Based Image Editing with Diffusion Models. 472-482 - Zitian Zhang, Frédéric Fortier-Chouinard, Mathieu Garon, Anand Bhattad, Jean-François Lalonde:
ZeroComp: Zero-Shot Object Compositing from Image Intrinsics via Diffusion. 483-494 - Diego Thomas, Briac Toussaint, Jean-Sébastien Franco, Edmond Boyer:
VortSDF: 3D Modeling with Centroidal Voronoi Tessellation on Signed Distance Field. 495-504 - Markus Plack, Hannah Dröge, Leif Van Holland, Matthias B. Hullin:
VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors. 505-514 - Ren Matsumoto, Takahiro Okabe, Ryo Kawahara:
Polarization as Texture: Microscale 3D Shape from Polarized Light Focus. 515-524 - Yuxin Huang, Andong Yang, Yuantao Chen, Runyi Yang, Zhenxin Zhu, Chao Hou, Hao Zhao, Guyue Zhou:
Self-Aligning Depth-Regularized Radiance Fields for Asynchronous RGB-D Sequences. 525-534 - Henrique Piñeiro Monteagudo, Leonardo Taccari, Aurel Pjetri, Francesco Sambo, Samuele Salti:
RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation. 535-544 - Yujing Sun, Caiyi Sun, Yuan Liu, Yuexin Ma, Siu Ming Yiu:
Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching. 545-556 - Chetan Madan, Mayuna Gupta, Soumen Basu, Pankaj Gupta, Chetan Arora:
LQ-Adapter: ViT-Adapter with Learnable Queries for Gallbladder Cancer Detection from Ultrasound Images. 557-567 - Xiwei Liu, Mohamad Kassab, Min Xu, Qirong Ho:
J-Invariant Volume Shuffle for Self-Supervised Cryo-Electron Tomogram Denoising on Single Noisy Volume. 568-577 - Daniel Khalil, Christina Liu, Pietro Perona, Jennifer J. Sun, Markus Marks:
Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision. 578-588 - Nirhoshan Sivaroopan, Chamuditha Jayanga Galappaththige, Chalani Ekanayake, Hasindri Watawana, Ranga Rodrigo, Chamira U. S. Edussooriya, Dushan N. Wadduwage:
Uncertainty Awareness Enables Efficient Labeling for Cancer Subtyping in Digital Pathology. 589-598 - Hyeongmin Park, Sungrae Hong, Chanjae Song, Jongwoo Kim, Mun Yong Yi:
Uncertainty-based Data-wise Label Smoothing for Calibrating Multiple Instance Learning in Histopathology Image Classification. 599-608 - Huimin Zeng, Jiacheng Li, Ziqiang Zheng, Zhiwei Xiong:
All-in-One Image Compression and Restoration. 609-619 - Sourajit Saha, Tejas Gokhale:
Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling. 620-629 - Pritam Karmokar, Quan H. Nguyen, William J. Beksi:
Secrets of Edge-Informed Contrast Maximization for Event-Based Vision. 630-639 - Sangwon Lee, Myungsub Choi, Nagyeong Lee, Hyong-Euk Lee:
Stable Autofocus with Focal Consistency Loss. 640-649 - Ashish Tiwari, Mihirkumar Sutariya, Shanmuganathan Raman:
LIPIDS: Learning-based Illumination Planning In Discretized (Light) Space for Photometric Stereo. 650-659 - Jiancheng Huang, Yi Huang, Jianzhuang Liu, Donghao Zhou, Yifan Liu, Shifeng Chen:
Dual-Schedule Inversion: Training- and Tuning-Free Inversion for Real Image Editing. 660-669 - Sanuwani Dayarathna, Kh Tohidul Islam, Bohan Zhuang, Guang Yang, Jianfei Cai, Meng Law, Zhaolin Chen:
McCaD: Multi-Contrast MRI Conditioned, Adaptive Adversarial Diffusion Model for High-Fidelity MRI Synthesis. 670-679 - Kyungri Park, Woohwan Jung:
Improving Detail in Pluralistic Image Inpainting with Feature Dequantization. 680-689 - Kyungmin Jo, Jaegul Choo:
Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects. 690-699 - Arya Bangun, Zhuo Cao, Alessio Quercia, Hanno Scharr, Elisabeth Pfaehler:
MRI Reconstruction with Regularized 3D Diffusion Model (R3DM). 700-710 - Chengjie Huang, Vahdat Abdelzad, Sean Sedwards, Krzysztof Czarnecki:
VADet: Multi-Frame LiDAR 3D Object Detection Using Variable Aggregation. 711-720 - Gursimran Singh, Tianxi Hu, Mohammad Akbari, Qiang Tang, Yong Zhang:
Towards Secure and Usable 3D Assets: A Novel Framework for Automatic Visible Watermarking. 721-730 - Xinyue Wei, Fanbo Xiang, Sai Bi, Anpei Chen, Kalyan Sunkavalli, Zexiang Xu, Hao Su:
NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support. 731-741 - Decai Chen, Brianne Oberson, Ingo Feldmann, Oliver Schreer, Anna Hilsmann, Peter Eisert:
Adaptive and Temporally Consistent Gaussian Surfels for Multi-View Dynamic Reconstruction. 742-752 - Gonzalo Martin Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, Bastian Leibe:
Fine-Tuning Image-Conditional Diffusion Models is Easier than you Think. 753-762 - Mahdi Alehdaghi, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger:
Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification. 763-773 - Jiahao Luo, Jing Liu, James Davis:
SplatFace: Gaussian Splat Face Reconstruction Leveraging an Optimizable Surface. 774-783 - Jui-Che Chiang, Hou-Ning Hu, Bo-Syuan Hou, Chia-Yu Tseng, Yu-Lun Liu, Min-Hung Chen, Yen-Yu Lin:
ORFormer: Occlusion-Robust Transformer for Accurate Facial Landmark Detection. 784-793 - Rohit Lal, Saketh Bachu, Yash Garg, Arindam Dutta, Calvin-Khang Ta, Hannah Dela Cruz, Dripta S. Raychaudhuri, M. Salman Asif, Amit K. Roy-Chowdhury:
STRIDE: Single-Video Based Temporally Continuous Occlusion-Robust 3D Pose Estimation. 794-803 - Kartik Narayan, Nithin Gopalakrishnan Nair, Jennifer Xu, Rama Chellappa, Vishal M. Patel:
PETALface: Parameter Efficient Transfer Learning for Low-Resolution Face Recognition. 804-814 - Zengqun Zhao, Yu Cao, Shaogang Gong, Ioannis Patras:
Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer. 815-824 - Elaine Sui, Xiaohan Wang, Serena Yeung-Levy:
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models. 825-835 - Leon Sick, Dominik Engel, Pedro Hermosilla, Timo Ropinski:
Attention-Guided Masked Autoencoders for Learning Image Representations. 836-846 - Donghyeon Kwon, Inho Kim, Suha Kwak:
Boosting Semi-Supervised Video Action Detection with Temporal Context. 847-858 - Gabriele Spadaro, Marco Grangetto, Attilio Fiandrotti, Enzo Tartaglione, Jhony H. Giraldo:
WiGNet: Windowed Vision Graph Neural Network. 859-868 - Fei Wu, Pablo Márquez-Neila, Hedyeh Rafii-Tari, Raphael Sznitman:
Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation. 869-878 - Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu:
DeepMIM: Deep Supervision for Masked Image Modeling. 879-888 - Surojit Saha, Sarang C. Joshi, Ross T. Whitaker:
ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders. 889-898 - Wonjun Kang, Kevin Galim, Hyung Il Koo, Nam Ik Cho:
Counting Guidance for High Fidelity Text-to-Image Synthesis. 899-908 - Rui Xu, Mengya Hu, Deren Lei, Yaxi Li, David Lowe, Alex Gorevski, Mingyu Wang, Emily Ching, Alex Deng:
InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance. 909-918 - Zhiyuan Xu, Yinhe Chen, Huan-ang Gao, Weiyan Zhao, Guiyu Zhang, Hao Zhao:
Diffusion-based Visual Anagram as Multi-task Learning. 919-928 - Ashutosh Srivastava, Tarun Ram Menta, Abhinav Java, Avadhoot Jadhav, Silky Singh, Surgan Jandial, Balaji Krishnamurthy:
REEDIT: Multimodal Exemplar-Based Image Editing. 929-939 - Tanvir Mahmud, Mustafa Munir, Radu Marculescu, Diana Marculescu:
Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior. 940-949 - Bo Lang, Mooi Choo Chuah:
Event-Guided Fusion-Mamba for Context-Aware 3D Human Pose Estimation. 950-960 - Luchao Qi, Jiaye Wu, Annie N. Wang, Shengze Wang, Roni Sengupta:
My3DGen: A Scalable Personalized 3D Generative Model. 961-972 - Ashkan Ganj, Hang Su, Tian Guo:
HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors. 973-982 - Keon Moradi, Ethan Haque, Jasmeen Kaur, Alexandra B. Bentz, Eli S. Bridge, Golnaz Habibi:
Context-Aware Outlier Rejection for Robust Multi-View 3D Tracking of Similar Small Birds in An Outdoor Aviary. 983-991 - Stathis Galanakis, Alexandros Lattas, Stylianos Moschoglou, Stefanos Zafeiriou:
FitDiff: Robust Monocular 3D Facial Shape and Reflectance Estimation using Diffusion Models. 992-1004 - Junyi Cao, Chao Ma:
Towards Generalized Face Anti-Spoofing from a Frequency Shortcut View. 1005-1015 - Marco Huber, Naser Damer:
Beyond Spatial Explanations: Explainable Face Recognition in the Frequency Domain. 1016-1026 - Diana Voth, Leonidas Dane, Jonas Grebe, Sebastian Peitz, Philipp Terhörst:
Effective Backdoor Learning on Open-Set Face Recognition Systems. 1027-1039 - Han-Wei Kung, Tuomas Varanka, Sanjay Saha, Terence Sim, Nicu Sebe:
Face Anonymization Made Simple. 1040-1050 - Yuxiang Guo, Anshul Shah, Jiang Liu, Ayush Gupta, Rama Chellappa, Cheng Peng:
GaitContour: Efficient Gait Recognition Based on a Contour-Pose Representation. 1051-1061 - Sanoojan Baliah, Qinliang Lin, Shengcai Liao, Xiaodan Liang, Muhammad Haris Khan:
Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models. 1062-1071 - Rui Li, Martin Trapp, Marcus Klasson, Arno Solin:
Flatness Improves Backbone Generalisation in Few-Shot Classification. 1072-1089 - Andrea Alfarano, Alberto Alfarano, Linda Friso, Andrea Bacciu, Irene Amerini, Fabrizio Silvestri:
STLight: A Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal Joint Processing. 1090-1100 - Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray L. Buntine, Mohammed Bennamoun:
HOPE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts. 1101-1110 - Kiran Kokilepersaud, Seulgi Kim, Mohit Prabhushankar, Ghassan AlRegib:
HEX: Hierarchical Emergence Exploitation in Self-Supervised Algorithms. 1111-1121 - Yunbei Zhang, Akshay Mehra, Jihun Hamm:
OT-VP: Optimal Transport-Guided Visual Prompting for Test-Time Adaptation. 1122-1132 - Marcelo Sanchez, Gil Triginer, Coloma Ballester, Ignacio Sarasua, Lara Raad:
A New Benchmark and Baseline for Real-Time High-Resolution Image Inpainting on Edge Devices. 1133-1143 - Mrinal Verghese, Brian Chen, Hamid Eghbalzadeh, Tushar Nagarajan, Ruta Desai:
User-in-the-Loop Evaluation of Multimodal LLMs for Activity Assistance. 1144-1154 - Yan-Bo Lin, Yu Tian, Linjie Yang, Gedas Bertasius, Heng Wang:
VMAs: Video-to-Music Generation via Semantic Alignment in Web Music Videos. 1155-1165 - Daeyoung Roh, Donghee Han, Jihyun Nam, Jungsoo Oh, Youngbin You, Jeongheon Park, Mun Yong Yi:
CTIP: Towards Accurate Tabular-to-Image Generation for Tire Footprint Generation. 1166-1175 - Younghyun Cho, Changhun Lee, Seonggon Kim, Eunhyeok Park:
PTQ4VM: Post-Training Quantization for Visual Mamba. 1176-1185 - Jay N. Paranjape, Celso de Melo, Vishal M. Patel:
A Mamba-Based Siamese Network for Remote Sensing Change Detection. 1186-1196 - Julian D. Santamaria, Claudia Isaza, Jhony H. Giraldo:
CATALOG: A Camera Trap Language-Guided Contrastive Learning Model. 1197-1206 - Faith M. Johnson, Ryan Meegan, Jack Lowry, Peter Oudemans, Kristin J. Dana:
Agtech Framework for Cranberry-Ripening Analysis Using Vision Foundation Models. 1207-1216 - Feng Chen, Sotirios A. Tsaftaris, Mario Valerio Giuffrida:
GMT: Guided Mask Transformer for Leaf Instance Segmentation. 1217-1226 - Shirin Qiam, Saipraneeth Devunuri, Lewis J. Lehe:
A Pipeline and NIR-Enhanced Dataset for Parking Lot Segmentation. 1227-1236 - Shao-Hao Lu, Ren Wang, Ching-Chun Huang, Wei-Chen Chiu:
Boosting Diffusion Guidance via Learning Degradation-Aware Models for Blind Super Resolution. 1237-1246 - Antoine Mercier, Ramin Nakhli, Mahesh Reddy, Rajeev Yasarla, Hong Cai, Fatih Porikli, Guillaume Berger:
HexaGen3D: StableDiffusion is One Step Away from Fast and Diverse Text-to-3D Generation. 1247-1257 - Ali Mollaahmadi Dehaghi, Reza Razavi, Mohammad Moshirpour:
Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression. 1258-1267 - Jianyi Zhang, Yufan Zhou, Jiuxiang Gu, Curtis Wigington, Tong Yu, Yiran Chen, Tong Sun, Ruiyi Zhang:
ARTIST: Improving the Generation of Text-Rich Images with Disentangled Diffusion Models and Large Language Models. 1268-1278 - Lorenzo Mandelli, Stefano Berretti:
Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models. 1279-1288 - S. Divakar Bhat, Amit More, Mudit Soni, Surbhi Agrawal:
Prior2Posterior: Model Prior Correction for Long-Tailed Learning. 1289-1298 - Prafful Kumar Khoba, Zijian Wang, Chetan Arora, Mahsa Baktashmotlagh:
Feature Space Perturbation: A Panacea to Enhanced Transferability Estimation. 1299-1308 - Hayeong Yu, Seungjae Han, Young-Gyu Yoon:
Design Principles of Multi-Scale J-Invariant Networks for Self-Supervised Image Denoising. 1309-1318 - Simon Damm, Mike Laszkiewicz, Johannes Lederer, Asja Fischer:
AnomalyDINO: Boosting Patch-based Few-Shot Anomaly Detection with DINOv2. 1319-1329 - Abu Zahid Bin Aziz, Mokshagna Sai Teja Karanam, Tushar Kataria, Shireen Y. Elhabian:
EFFICIENTMORPH: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration. 1330-1341 - Wenxin Ma, Qingsong Yao, Xiang Zhang, Zhelong Huang, Zihang Jiang, S. Kevin Zhou:
Towards Accurate Unified Anomaly Segmentation. 1342-1352 - Junhyeong Go, Jongbin Ryu:
Channel Propagation Networks for Refreshable Vision Transformer. 1353-1362 - Muhammad Ali, Mamoona Javaid, Mubashir Noman, Mustansar Fiaz, Salman H. Khan:
COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes. 1363-1372 - Lucas Deregnaucourt, Hind Laghmara, Alexis Lechervy, Samia Ainouz:
A Conflict-Guided Evidential Multimodal Fusion for Semantic Segmentation. 1373-1382 - Monika Kwiatkowski, Simon Matern, Olaf Hellwich:
Swin-∇: Gradient-Based Image Restoration from Image Sequences using Video Swin-Transformers. 1383-1391 - Gautier Evennou, Antoine Chaffin, Vivien Chappelier, Ewa Kijak:
Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation. 1392-1402 - Mohammad Reza Taesiri, Cor-Paul Bezemer:
VideoGameBunny: Towards Vision Assistants for Video Games. 1403-1413 - Aiyu Cui, Jay Mahajan, Viraj Shah, Preeti Gomathinayagam, Chang Liu, Svetlana Lazebnik:
Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images. 1414-1423 - Jeya Maria Jose Valanarasu, Rahul Garg, Andeep Toor, Xin Tong, Weijuan Xi, Andreas Lugmayr, Vishal M. Patel, Anne Menini:
ReBotNet: Fast Real-Time Video Enhancement. 1424-1435 - Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickaël Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós:
GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification. 1436-1446 - Yizhou Wang, Kuan-Chuan Peng, Yun Fu:
Towards Zero-shot 3D Anomaly Localization. 1447-1456 - Seoungyoon Kang, Youngsun Lim, Hyunjung Shim:
Label-Augmented Dataset Distillation. 1457-1466 - Surbhi Madan, Shreya Ghosh, Lownish Rai Sookha, M. A. Ganaie, Ramanathan Subramanian, Abhinav Dhall, Tom Gedeon:
MIP-GAF: A MLLM-Annotated Benchmark for Most Important Person Localization and Group Context Understanding. 1467-1476 - Paritosh Parmar, Eric Peh, Basura Fernando:
Learning to Visually Connect Actions and Their Effects. 1477-1487 - Keren Ganon, Morris Alper, Rachel Mikulinsky, Hadar Averbuch-Elor:
WAFFLE: Multimodal Floorplan Understanding in the Wild. 1488-1497 - Dac Thai Nguyen, Trung Thanh Nguyen, Huu Tien Nguyen, Thanh Trung Nguyen, Huy Hieu Pham, Thanh Hung Nguyen, Truong Thao Nguyen, Phi Le Nguyen:
CT to PET Translation: A Large-Scale Dataset and Domain-Knowledge-Guided Diffusion Approach. 1498-1507 - Jiahao Xu, Zikai Zhang, Rui Hu:
Achieving Byzantine-Resilient Federated Learning via Layer-Adaptive Sparsified Model Aggregation. 1508-1517 - Jung Im Choi, Qizhen Lan, Qing Tian:
Improving Deep Detector Robustness via Detection-Related Discriminant Maximization and Reorganization. 1518-1527 - Rambod Azimi, Yijian Kong, Dusan Gostimirovic, James J. Clark, Odile Liboiron-Ladouceur:
SEMU-Net: A Segmentation-Based Corrector for Fabrication Process Variations of Nanophotonics with Microscopic Images. 1528-1536 - Seonguk Seo, Mustafa Gökhan Uzunbas, Bohyung Han, Sara Cao, Ser-Nam Lim:
Metric Compatible Training for Online Backfilling in Large-Scale Retrieval. 1537-1545 - Dimitrios Sinodinos, Narges Armanfard:
Cross-Task Affinity Learning for Multitask Dense Scene Predictions. 1546-1555 - Sourasekhar Banerjee, Debaditya Roy, Vigneshwaran Subbaraju, Monowar H. Bhuyan:
Predicting Event Memorability Using Personalized Federated Learning. 1556-1565 - Hamidreza Dastmalchi, Aijun An, Ali Cheraghian, Shafin Rahman, Sameera Ramasinghe:
Test-Time Adaptation of 3D Point Clouds via Denoising Diffusion Models. 1566-1576 - Dan-Sebastian Bacea, Florin Oniga:
ECF-YOLOv7-Tiny: Improving Feature Fusion and the Receptive Field for Lightweight Object Detectors. 1577-1586 - Giulia Rizzoli, Matteo Caligiuri, Donald Shenaj, Francesco Barbato, Pietro Zanuttigh:
When Cars Meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather. 1587-1596 - Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati:
SUM: Saliency Unification Through Mamba for Visual Attention Modeling. 1597-1607 - Nyle Siddiqui, Florinel-Alin Croitoru, Gaurav Kumar Nayak, Radu Tudor Ionescu, Mubarak Shah:
DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-Id. 1608-1617 - Kuan-Hung Liu, Cheng-Kun Yang, Min-Hung Chen, Yu-Lun Liu, Yen-Yu Lin:
CorrFill: Enhancing Faithfulness in Reference-Based Inpainting with Correspondence Guidance in Diffusion Models. 1618-1627 - Shuo Wang, Chunlong Xia, Feng Lv, Yifeng Shi:
RT-DETRv3: Real-Time End-to-End Object Detection with Hierarchical Dense Positive Supervision. 1628-1636 - Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamín Béjar, Luc Van Gool:
Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation. 1637-1648 - Saheli Hazra, Sudip Das, Rohit Choudhary, Arindam Das, Ganesh Sistu, Ciarán Eising, Ujjwal Bhattacharya:
Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure. 1649-1659 - Maciej K. Wozniak, Hariprasath Govindarajan, Marvin Klingner, Camille Maurice, Ravi Kiran, Senthil Kumar Yogamani:
S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving. 1660-1670 - Rémy Sun, Li Yang, Diane Lingrand, Frédéric Precioso:
Mind the Map! Accounting for Existing Maps When Estimating Online HDMaps from Sensors. 1671-1681 - Adrien Lafage, Mathieu Barbier, Gianni Franchi, David Filliat:
Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting. 1682-1691 - Chaesong Park, Eunbin Seo, Jongwoo Lim:
HeightLane: BEV Heightmap Guided 3D Lane Detection. 1692-1701 - Roy Uziel, Oded Bialer:
Optimizing Vision-Language Model for Road Crossing Intention Estimation. 1702-1712 - Xiaoyu Zhang, Ziwei Wang, Hai Dong, Zhifeng Bao, Jiajun Liu:
On-the-Fly Object-aware Representative Point Selection in Point Cloud. 1713-1722 - Nikos Efthymiadis, Bill Psomas, Zakaria Laskar, Konstantinos Karantzalos, Yannis Avrithis, Ondrej Chum, Giorgos Tolias:
Composed Image Retrieval for Training-FREE DOMain Conversion. 1723-1733 - Zifu Wan, Pingping Zhang, Yuhao Wang, Silong Yong, Simon Stepputtis, Katia P. Sycara, Yaqi Xie:
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation. 1734-1744 - Hanoona Abdul Rasheed, Muhammad Maaz, Abdelrahman M. Shaker, Salman H. Khan, Hisham Cholakkal, Rao Muhammad Anwer, Tim Baldwin, Michael Felsberg, Fahad Shahbaz Khan:
Palo: A Polyglot Large Multimodal Model for 5B People. 1745-1754 - Anjishnu Mukherjee, Ziwei Zhu, Antonios Anastasopoulos:
Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models. 1755-1764 - Srikumar Sastry, Subash Khanal, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs:
TaxaBind: A Unified Embedding Space for Ecological Applications. 1765-1774 - Qianyi Liu, Siqi Zhang, Yanyuan Qiao, Junyou Zhu, Xiang Li, Longteng Guo, Qunbo Wang, Xingjian He, Qi Wu, Jing Liu:
GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation. 1775-1784 - Florian Hofherr, Bjoern Haefner, Daniel Cremers:
On Neural BRDFs: A Thorough Comparison of State-of-the-Art Approaches. 1785-1794 - Leif Van Holland, Michael Weinmann, Jan U. Müller, Patrick Stotko, Reinhard Klein:
NeRFs are Mirror Detectors: Using Structural Similarity for Multi-View Mirror Scene Reconstruction with 3D Surface Primitives. 1795-1807 - Hugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic:
RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis. 1808-1817 - Konstantinos Tzevelekakis, Shutong Zhang, Luc Van Gool, Christos Sakaridis:
Sun Off, Lights on: Photorealistic Monocular Nighttime Simulation for Robust Semantic Perception. 1818-1828 - Kengo Matsufuji, Lin Shi, Ryo Kawahara, Takahiro Okabe:
Separating Direct and Global Components from Novel Viewpoints. 1829-1838 - Tianshu Kuai, Sina Honari, Igor Gilitschenski, Alex Levinshtein:
Towards Unsupervised Blind Face Restoration Using Diffusion Prior. 1839-1849 - Naeun Ko, Yonghyun Jeong, Jong Chul Ye:
Text-to-Image Synthesis for Domain Generalization in Face Anti-Spoofing. 1850-1860 - Huawei Sun, Zixu Wang, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille:
GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling. 1850-1860 - Bin Yan, Martin Sundermeyer, David Joseph Tan, Huchuan Lu, Federico Tombari:
Towards Real-Time Open-Vocabulary Video Instance Segmentation. 1861-1871 - Hakjin Lee, Minki Song, Jamyoung Koo, Junghoon Seo:
Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer. 1872-1882 - Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum:
Crafting Distribution Shifts for Validation and Training in Single Source Domain Generalization. 1883-1892 - Abbas Khan, Muhammad Asad, Martin Benning, Caroline H. Roney, Gregory G. Slabaugh:
CAMS: Convolution and Attention-Free Mamba-Based Cardiac Image Segmentation. 1893-1903 - Wenhao Gu, Li Gu, Ziqiang Wang, Ching Yee Suen, Yang Wang:
DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning. 1904-1913 - Wulin Xie, Lian Zhao, Jiang Long, Xiaohuan Lu, Bingyan Nie:
Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification. 1914-1923 - Shahriar Rifat, Jonathan D. Ashdown, Francesco Restuccia:
DARDA: Domain-Aware Real-Time Dynamic Neural Network Adaptation. 1924-1932 - Hidehisa Arai, Keita Miwa, Kento Sasaki, Kohei Watanabe, Yu Yamaguchi, Shunsuke Aoki, Issei Yamamoto:
CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving. 1933-1943 - Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, Saumik Bhattacharya:
FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework. 1944-1954 - Sreetama Sarkar, Gourav Datta, Souvik Kundu, Kai Zheng, Chirayata Bhattacharyya, Peter A. Beerel:
MaskVD: Region Masking for Efficient Video Object Detection. 1955-1964 - Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu:
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding. 1965-1978 - Zhonghua Yi, Hao Shi, Qi Jiang, Kailun Yang, Ze Wang, Diyang Gu, Yufan Zhang, Kaiwei Wang:
EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data. 1979-1988 - Michael Schwingshackl, Fabio Francisco Oberweger, Markus Murschitz:
Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks. 1989-1998 - Jiange Yang, Wenhui Tan, Chuhao Jin, Keling Yao, Bei Liu, Jianlong Fu, Ruihua Song, Gangshan Wu, Limin Wang:
Transferring Foundation Models for Generalizable Robotic Manipulation. 1999-2010 - Raktim Gautam Goswami, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami:
FlashMix: Fast Map-Free LiDAR Localization via Feature Mixing and Contrastive-Constrained Accelerated Training. 2011-2020 - Jianhao Zheng, Gábor Valasek, Daniel Barath, Iro Armeni:
Multi-HexPlanes: A Lightweight Map Representation for Rendering and 3D Reconstruction. 2021-2031 - Lin Shi, Kengo Matsufuji, Ryo Kawahara, Takahiro Okabe:
FluoNeRF: Fluorescent Novel-View Synthesis Under Novel Light Source Colors. 2032-2041 - Eito Ikuta, Yohan Lee, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka:
Harmonizing Attention: Training-free Texture-aware Geometry Transfer. 2042-2051 - Chengyang Yan, Donald G. Dansereau:
TaCOS: Task-Specific Camera Optimization with Simulation. 2052-2062 - Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka:
Negative-Prompt Inversion: Fast Image Inversion for Editing with Text-Guided Diffusion Models. 2063-2072 - Stanislav Frolov, Brian B. Moser, Andreas Dengel:
SpotDiffusion: A Fast Approach for Seamless Panorama Generation Over Time. 2073-2081 - Prajneya Kumar, Eshika Khandelwal, Makarand Tapaswi, Vishnu Sreekumar:
Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability. 2082-2091 - Akshita Gupta, Gaurav Mittal, Ahmed Magooda, Ye Yu, Graham W. Taylor, Mei Chen:
LoSA: Long-Short-Range Adapter for Scaling End-to-End Temporal Action Localization. 2092-2102 - Minghui Lin, Shu Wang, Xiang Wang, Jianhua Tang, Longbin Fu, Zhengrong Zuo, Nong Sang:
DMPT: Decoupled Modality-Aware Prompt Tuning for Multi-Modal Object Re-Identification. 2103-2112 - Rita Pucci, Niki Martinel:
CE-VAE: Capsule Enhanced Variational AutoEncoder for Underwater Image Enhancement. 2113-2123 - Logan Servant, Michaël Clément, Laurent Wendling, Camille Kurtz:
Contrastive Learning of Image Representations Guided by Spatial Relations. 2124-2133 - Katharina Prasse, Isaac Bravo, Stefanie Walter, Margret Keuper:
I Spy with My Little Eye a Minimum Cost Multicut Investigation of Dataset Frames. 2134-2143 - Jingbo Zeng, Zaiwang Gu, Weide Liu, Lile Cai, Jun Cheng:
Uncertainty Aware Interest Point Detection and Description. 2144-2153 - Jiawei Yao, Jusheng Zhang, Xiaochao Pan, Tong Wu, Canran Xiao:
DepthSSC: Monocular 3D Semantic Scene Completion via Depth-Spatial Alignment and Voxel Adaptation. 2154-2163 - Yongkang Cheng, Mingjiang Liang, Shaoli Huang, Gaoge Han, Jifeng Ning, Wei Liu:
Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios. 2164-2173 - Chen Zhao, Mengyuan Yu, Fan Yang, Peiguang Jing:
VIIS: Visible and Infrared Information Synthesis for Severe Low-Light Image Enhancement. 2174-2184 - Saad Lahlali, Nicolas Granger, Hervé Le Borgne, Quoc-Cuong Pham:
ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only. 2185-2194 - Rao Fu, Jingyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong:
Scene-LLM: Extending Language Model for 3D Visual Reasoning. 2195-2206 - Tai D. Nguyen, Matthew C. Stamm:
MVFNet: Multipurpose Video Forensics Network using Multiple Forms of Forensic Evidence. 2207-2217 - Gaoge Han, Mingjiang Liang, Jinglei Tang, Yongkang Cheng, Wei Liu, Shaoli Huang:
ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model. 2218-2227 - Shaoxiang Wang, Yaxu Xie, Chun-Peng Chang, Christen Millerdurai, Alain Pagani, Didier Stricker:
Uni-SLAM: Uncertainty-Aware Neural Implicit SLAM for Real-Time Dense Indoor Scene Reconstruction. 2228-2239 - Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Jürgen Gall, Amirhossein Habibian:
Valid: Variable-Length Input Diffusion for Novel View Synthesis. 2240-2249 - Florian Chabot, Nicolas Granger, Guillaume Lapouge:
GaussianBeV: 3D Gaussian Representation meets Perception Models for BeV Segmentation. 2250-2259 - Longwei Li, Huajian Huang, Sai-Kit Yeung, Hui Cheng:
OmniGS: Fast Radiance Field Reconstruction Using Omnidirectional Gaussian Splatting. 2260-2268 - Tao Tu, Ming-Feng Li, Chieh Hubert Lin, Yen-Chi Cheng, Min Sun, Ming-Hsuan Yang:
DreaMo: Articulated 3D Reconstruction from a Single Casual Video. 2269-2279 - Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Fiona R. Kolbinger, Stefanie Speidel:
Data Augmentation for Surgical Scene Segmentation with Anatomy-Aware Diffusion Models. 2280-2290 - Fotios Logothetis, Ignas Budvytis, Roberto Cipolla:
NPL-MVPS: Neural Point-Light Multi-View Photometric Stereo. 2291-2300 - Wenzhao Li, Tianhao Wu, Fangcheng Zhong, Cengiz Öztireli:
ARF-Plus: Controlling Perceptual Factors in Artistic Radiance Fields for 3D Scene Stylization. 2301-2310 - Sachin Raja, Ajoy Mandal, C. V. Jawahar:
Treading Towards Privacy-Preserving Table Structure Recognition. 2311-2321 - Tong Wei, Philipp Lindenberger, Jirí Matas, Daniel Barath:
Breaking the Frame: Visual Place Recognition by Overlap Prediction. 2322-2331 - G. Ujwal Sai, Arkadipta De, Vartika Sengar, Anuj Rathore, Daksh Thapar, Manohar Kaul:
Learning Semantic Part-Based Graph Structure for 3D Point Cloud Domain Generalization. 2332-2341 - Jiuxiang Gu, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song:
Differential Privacy Mechanisms in Neural Tangent Kernel Regression. 2342-2356 - Adith Boloor, Weikai Lin, Tianrui Ma, Yu Feng, Yuhao Zhu, Xuan Zhang:
PrivateEye: In-Sensor Privacy Preservation Through Optical Feature Separation. 2357-2367 - Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura:
Unsupervised Single-Image Intrinsic Image Decomposition with LiDAR Intensity Enhanced Training. 2368-2378 - Benjamin Salmon, Alexander Krull:
Unsupervised Denoising for Signal-Dependent and Row-Correlated Imaging Noise. 2379-2389 - Chen Wu, Ling Wang, Long Peng, Dianjie Lu, Zhuoran Zheng:
Dropout the High-Rate Downsampling: A Novel Design Paradigm for UHD Image Restoration. 2390-2399 - Ankit Dhiman, R. Srinath, Srinjay Sarkar, Lokesh R. Boregowda, R. Venkatesh Babu:
ChromaDistill: Colorizing Monochrome Radiance Fields with Knowledge Distillation. 2400-2410 - Chaohao Xie, Kai Han, Kwan-Yee K. Wong:
VipDiff: Towards Coherent and Diverse Video Inpainting via Training-Free Denoising Diffusion Models. 2411-2420 - Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu, Juho Kannala:
DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing. 2421-2431 - Dongwoo Park, Suk Pil Ko:
NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior. 2432-2441 - Bo Ji, Angela Yao:
High-Pass Kernel Prediction for Efficient Video Deblurring. 2442-2452 - Guoshan Liu, Hailong Yin, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yu-Gang Jiang:
Retrieval Augmented Recipe Generation. 2453-2463 - Nan Cai, Pia Bideau:
Active Event Alignment for Monocular Distance Estimation. 2464-2473 - Hojun Jang, Young Min Kim:
ReMP: Reusable Motion Prior for Multi-domain 3D Human Pose Estimation and Motion Inbetweening. 2474-2483 - Zhiyu Pan, Zhicheng Zhong, Wenxuan Guo, Yifan Chen, Jianjiang Feng, Jie Zhou:
LiCamPose: Combining Multi-View LiDAR and RGB Cameras for Robust Single-timestamp 3D Human Pose Estimation. 2484-2494 - Laura O'Mahony, Nikola S. Nikolov, David J. P. O'Sullivan:
Towards Utilising a Range of Neural Activations for Comprehending Representational Associations. 2495-2506 - Amit Giloni, Omer Hofman, Ikuya Morikawa, Toshiya Shimizu, Yuval Elovici, Asaf Shabtai:
DiL: An Explainable and Practical Metric for Abnormal Uncertainty in Object Detection. 2507-2516 - Dongyu Yan, Guanyu Huang, Fengyu Quan, Haoyao Chen:
MSI-NeRF: Linking Omni-Depth with View Synthesis Through Multi-Sphere Image Aided Generalizable Neural Radiance Field. 2517-2526 - Giacomo Capitani, Lorenzo Bonicelli, Angelo Porrello, Federico Bolelli, Simone Calderara, Elisa Ficarra:
Towards Unbiased Continual Learning: Avoiding Forgetting in the Presence of Spurious Correlations. 2527-2537 - Juhyeon Park, Seokhyeon Jeong, Taesup Moon:
TLDR: Text Based Last-Layer Retraining for Debiasing Image Classifiers. 2538-2547 - Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino:
Looking at Model Debiasing through the Lens of Anomaly Detection. 2548-2557 - Mingqi Shao, Feng Xiong, Hang Zhang, Shuang Yang, Mu Xu, Wei Bian, Xueqian Wang:
Global-Guided Focal Neural Radiance Field for Large-Scale Scene Rendering. 2558-2567 - Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong Xie:
DivAvatar: Diverse 3D Avatar Generation with a Single Prompt. 2568-2577 - Ugo Leone Cavalcanti, Matteo Poggi, Fabio Tosi, Valerio Cambareri, Vladimir Zlokolica, Stefano Mattoccia:
CabNIR: A Benchmark for In-Vehicle Infrared Monocular Depth Estimation. 2578-2590 - Muhammad Salman Ali, Sung-Ho Bae, Enzo Tartaglione:
ELMGS: Enhancing Memory and Computation Scalability Through coMpression for 3D Gaussian Splatting. 2591-2600 - Matías Mendieta, Guangyu Sun, Chen Chen:
Navigating Heterogeneity and Privacy in One-Shot Federated Learning with Diffusion Models. 2601-2610 - Feng Xu, David Ahmedt-Aristizabal, Lars Petersson, Dadong Wang, Xun Li:
Facial Expression Recognition with Controlled Privacy Preservation and Feature Compensation. 2611-2621 - Hermes McGriff, Renato Martins, Nicolas Andreff, Cédric Demonceaux:
Dense Scene Reconstruction from Light-Field Images Affected by Rolling Shutter. 2622-2630 - Shilin Hu, Hieu Le, ShahRukh Athar, Sagnik Das, Dimitris Samaras:
Shadow Removal Refinement via Material-Consistent Shadow Edges. 2631-2641 - Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, Lei Xiao:
GauFRe: Gaussian Deformation Fields for Real-Time Dynamic Novel View Synthesis. 2642-2652 - Karam Park, Nam Ik Cho:
Partial Filter-Sharing: Improved Parameter-sharing Method for Single Image Super-Resolution Networks. 2653-2663 - Si-Yu Lu, Yung-Yao Chen, Yi-Tong Wu, Hsin-Chun Lin, Sin-Ye Jhong, Wen-Huang Cheng:
Radiance Field-Based Pose Estimation via Decoupled Optimization Under Challenging Initial Conditions. 2664-2673 - Yimu Wang, Krzysztof Czarnecki:
AiDe: Improving 3D Open-Vocabulary Semantic Segmentation by Aligned Vision-Language Learning. 2674-2685 - Yongjae Lee, Li Yang, Deliang Fan:
MFNeRF: Memory Efficient NeRF with Mixed-Feature Hash Table. 2686-2695 - Tu Vo, Chan Y. Park:
Deep Joint Unrolling for Deblurring and Low-Light Image Enhancement (JUDE). 2696-2705 - Hirunima Jayasekara, Khoi Pham, Nirat Saini, Abhinav Shrivastava:
Unified Framework for Open-World Compositional Zero-Shot Learning. 2706-2714 - Manh Duong Nguyen, Tuan Nghia Nguyen, Xuan Truong Nguyen:
ENAF: A Multi-Exit Network with an Adaptive Patch Fusion for Large Image Super Resolution. 2706-2714 - Yahan Chen, Wenzheng Liu, Xiaowei Luo:
Semantic Segmentation Method for Automated Indoor 3D Reconstruction based on Architectural-Knowledge-Aware Features. 2715-2724 - Asen Nachkov, Danda Pani Paudel, Martin Danelljan, Luc Van Gool:
Diffusion-Based Particle-DETR for BEV Perception. 2725-2735 - Aditya Dixit, Nischit Hosamani, Puneet Gupta, Ankur Garg:
VISIONARY: Novel Spatial-Spectral Attention Mechanism for Hyperspectral Image Denoising. 2736-2745 - Yujing Xue, Jiaxiang Liu, Jiawei Du, Joey Tianyi Zhou:
PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction. 2746-2755 - Han Zou, Masanori Suganuma, Takayuki Okatani:
RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution. 2756-2765 - Aimon Rahman, Malsha V. Perera, Vishal M. Patel:
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models. 2766-2776 - Gasser Elazab, Torben Gräber, Michael Unterreiner, Olaf Hellwich:
MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications. 2777-2787 - Alexandre Fournier-Montgieux, Michaël Soumm, Adrian Popescu, Bertrand Luvison, Hervé Le Borgne:
Fairer Analysis and Demographically Balanced Face Generation for Fairer Face Verification. 2788-2798 - Ziqiang Shi, Rujie Liu, Jun Takahashi, Takuma Yamamoto:
Bayesian Optimal Latent Projection for Noisy Image Restoration. 2799-2807 - Amartya Bhattacharya, Debarshi Brahma, Suraj Nagaje Mahadev, Anmol Asati, Vikas Verma, Soma Biswas:
Can Out-of-Domain Data Help to Learn Domain-Specific Prompts for Multimodal Misinformation Detection? 2808-2817 - Jiahui Li, Pourya Shamsolmoali, Yue Lu, Masoumeh Zareapoor:
ShapeMorph: 3D Shape Completion via Blockwise Discrete Diffusion. 2818-2827 - Inpyo Song, Sanghyeon Lee, Minjun Joo, Jangwon Lee:
Anomaly Detection for People with Visual Impairments Using an Egocentric 360-Degree Camera. 2828-2837 - Green Rosh K. S, Meghana Shankar, Prateek Kukreja, Anmol Namdev, B. H. Pawan Prasad:
XPose: Towards Extreme Low Light Hand Pose Estimation. 2838-2848 - Shaoxiong Zhang, Hiromitsu Awano, Takashi Sato:
Gaitcloud: Leveraging Spatial-Temporal Information for Lidar-Base Gait Recognition With a True-3D Gait Representation. 2849-2858 - Federico Nocentini, Claudio Ferrari, Stefano Berretti:
EmoVOCA: Speech-Driven Emotional 3D Talking Heads. 2859-2868 - Hugo Porta, Emanuele Dalsasso, Diego Marcos, Devis Tuia:
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation. 2869-2880 - Aleksandr Matsun, Numan Saeed, Fadillah Adamsyah Maani, Mohammad Yaqub:
ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization. 2881-2889 - Vivek Madhavaram, Shivangana Rawat, Chaitanya Devaguptapu, Charu Sharma, Manohar Kaul:
Towards a Training Free Approach for 3D Scene Editing. 2890-2899 - Leonard Bruns, Jun Zhang, Patric Jensfelt:
Neural Graph Map: Dense Mapping with Efficient Loop Closure Integration. 2900-2909 - Julian Kaltheuner, Patrick Stotko, Reinhard Klein:
ROSA: Reconstructing Object Shape and Appearance Textures by Adaptive Detail Transfer. 2910-2920 - Hossein Resani, Behrooz Nasihatkon, Mohammadreza Alimoradi Jazi:
Continual Learning in 3D Point Clouds: Employing Spectral Techniques for Exemplar Selection. 2921-2931 - Sanjay S. J, Akash J, Sreehari Rajan, Dimple A. Shajahan, Charu Sharma:
Adversarial Learning Based Knowledge Distillation on 3D Point Clouds. 2932-2941 - Annie N. Wang, Luchao Qi, Roni Sengupta:
Continual Learning of Personalized Generative Face Models with Experience Replay. 2942-2951 - Jae Joong Lee, Bedrich Benes:
RGB2Point: 3D Point Cloud Generation from Single RGB Images. 2952-2962 - Thomas Walker, Octave Mariotti, Amir Vaxman, Hakan Bilen:
Spatially-Adaptive Hash Encodings for Neural Surface Reconstruction. 2963-2972 - Esmat Ghasemi Saghand, Susana K. Lai-Yuen:
MONAS-ESNN: Multi-Objective Neural Architecture Search for Efficient Spiking Neural Networks. 2963-2972 - Mingjiang Liang, Yongkang Cheng, Hualin Liang, Shaoli Huang, Wei Liu:
RopeTP: Global Human Motion Recovery via Integrating Robust Pose Estimation with Diffusion Trajectory Prior. 2973-2982 - Jiawei Liu, Wayne Lam, Zhigang Zhu, Hao Tang:
SMDAF: A Scalable Sidewalk Material Data Acquisition Framework with Bidirectional Cross-Modal Knowledge Distillation. 2983-2992 - Anvita A. Srinivas, Tuomas P. Oikarinen, Divyansh Srivastava, Wei-Hung Weng, Tsui-Wei Weng:
SAND: Enhancing Open-Set Neuron Descriptions through Spatial Awareness. 2993-3002 - Shreya Saha, Zekai Liang, Shan Lin, Jingpei Lu, Michael C. Yip, Sainan Liu:
BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction Using Neural Radiance Fields. 3003-3012 - Chuanmao Fan, Chenxi Zhao, Ye Duan:
PVT: An Implicit Surface Reconstruction Framework via Point Voxel Geometric-Aware Transformer. 3013-3023 - Katherine Xu, Lingzhi Zhang, Jianbo Shi:
Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models. 3024-3034 - Naga Venkata Sai Raviteja Chappa, Khoa Luu:
LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition. 3035-3044 - Zhiyuan Gao, Wenbin Teng, Gonglin Chen, Jinsen Wu, Ningli Xu, Rongjun Qin, Andrew Feng, Yajie Zhao:
Skyeyes: Ground Roaming using Aerial View Images. 3045-3054 - Tingting Zhao, Chenguang Liu, Kamal Jnawali, Chang Su:
eLIR-Net: an Efficient AI Solution for Image Retouching. 3055-3063 - Haojie Cai, Dongfu Yin, Fei Richard Yu, Siting Xiong:
DSTR: Dual Scenes Transformer for Cross-Modal Fusion in 3D Object Detection. 3064-3073 - Wangduo Xie, Richard Schoonhoven, Tristan van Leeuwen, Matthew B. Blaschko:
AC-IND: Sparse CT Reconstruction Based on Attenuation Coefficient Estimation and Implicit Neural Distribution. 3074-3083 - Ziqi Gao, Wendi Yang, Yujia Li, Lei Xing, S. Kevin Zhou:
MS-Glance: Bio-Inspired Non-Semantic Context Vectors and Their Applications in Supervising Image Reconstruction. 3084-3095 - Ji Zhang, Yiran Ding, Zixin Liu:
OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction. 3096-3106 - Tung-Yu Wu, Sheng-Yu Huang, Yu-Chiang Frank Wang:
Data-Efficient 3D Visual Grounding via Order-Aware Referring. 3107-3117 - Brent Zoomers, Maarten Wijnants, Ivan Molenaers, Joni Vanherck, Jeroen Put, Nick Michiels:
PRoGS: Progressive Rendering of Gaussian Splats. 3118-3127 - Junjie Oscar Yin, Ting Li, Jiahao Wang, Yi Zhang, Alan L. Yuille:
EasyRet3D: Uncalibrated Multi-View Multi-Human 3D Reconstruction and Tracking. 3128-3137 - Jingtong Yue, Xin Lin, Zijiu Yang, Chao Ren:
Dual-Representation Interaction Driven Image Quality Assessment with Restoration Assistance. 3138-3147 - Chen Feng, Duolikun Danier, Fan Zhang, Alex Mackin, Andrew Collins, David Bull:
MVAD: A Multiple Visual Artifact Detector for Video Streaming. 3148-3158 - Katharina Bendig, René Schuster, Nicole Thiemer, Karen Joisten, Didier Stricker:
Supplementary Material AnonyNoise: Anonymizing Event Data with Smart Noise to Outsmart Re-Identification and Preserve Privacy. 3159-3161 - Jiahuan Li, Xiaoyu Dong, Wei He, Naoto Yokoya:
Wavelength- and Depth-Aware Deep Image Prior for Blind Hyperspectral Imagery Deblurring with Coarse Depth Guidance. 3162-3171 - Md Motiur Rahman, Mohamed Trabelsi, Hüseyin Uzunalioglu, Aidan Boyd:
Personalized Mixture of Experts for Multi-Site Medical Image Segmentation. 3172-3184 - Maor Dikter, Tsachi Blau, Chaim Baskin:
Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency. 3185-3195 - Yilin Zheng, Chiang-Heng Chien, Ricardo Fabbri, Benjamin B. Kimia:
3D Edge Sketch from Multiview Images. 3196-3205 - Seonguk Seo, Dongwan Kim, Bohyung Han:
Revisiting Machine Unlearning with Dimensional Alignment. 3206-3215 - Arkadipta De, Vartika Sengar, Daksh Thapar, Mahesh Chandran, Manohar Kaul:
Elemental Composite Prototypical Network: Few-Shot Object Detection on Outdoor 3D Point Cloud Scenes. 3216-3226 - Nourhan Bayasi, Jamil Fayyad, Ghassan Hamarneh, Rafeef Garbi, Homayoun Najjaran:
Debiasify: Self-Distillation for Unsupervised Bias Mitigation. 3227-3236 - Haidong Wu, Snehal Bhayani, Janne Heikkilä:
A Conic Transformation Approach for Solving the Perspective-Three-Point Problem. 3237-3245 - Kunal Kathare, Ankit Dhiman, Vikas K. Gowda, Siddharth Aravindan, Shubham Monga, Basavaraja Shanthappa Vandrotti, Lokesh R. Boregowda:
Instructive3D: Editing Large Reconstruction Models with Text Instructions. 3246-3256 - Marco Garosi, Riccardo Tedoldi, Davide Boscaini, Massimiliano Mancini, Nicu Sebe, Fabio Poiesi:
3D Part Segmentation via Geometric Aggregation of 2D Visual Features. 3257-3267 - Kunal Chelani, Assia Benbihi, Torsten Sattler, Fredrik Kahl:
EdgeGaussians - 3D Edge Mapping via Gaussian Splatting. 3268-3279 - Haoran Wang, Nantheera Anantrasirichai, Fan Zhang, David Bull:
UW-GS: Distractor-Aware 3D Gaussian Splatting for Enhanced Underwater Scene Reconstruction. 3280-3289 - Mohammad Farazi, Yalin Wang:
A Recipe for Geometry-Aware 3D Mesh Transformers. 3290-3300 - Kurt H. W. Stolle:
Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation. 3301-3309 - Michal Byra, Henrik Skibbe:
Generating Visual Explanations from Deep Networks Using Implicit Neural Representations. 3310-3319 - Youpeng Wen, Yi Zhu, Zhihao Zhan, Pengzhen Ren, Jianhua Han, Hang Xu, Shen Zhao, Xiaodan Liang:
DisCo: Discovering Common Affordance from Large Models for Actionable Part Perception. 3320-3329 - Zhen Yao, Mooi Choo Chuah:
Event-Guided Low-Light Video Semantic Segmentation. 3330-3341 - Masahiro Yamaguchi, Takashi Shibata, Shoji Yachida, Keiko Yokoyama, Toshinori Hosoi:
MDCN-PS: Monocular-Depth-Guided Coarse Normal Attention for Robust Photometric Stereo. 3342-3351 - Eui Jun Hwang, Sukmin Cho, Huije Lee, Youngwoo Yoon, Jong C. Park:
A Spatio-Temporal Representation Learning as an Alternative to Traditional Glosses in Sign Language Translation and Production. 3352-3362 - Devendra Patel, Vikas Verma, Shreyas Kumar Tah, Shwetabh Biswas, Soma Biswas:
FRAUD-Net: Fraud News Detection Using Sample Uncertainty & Domain Aware Generalized Network. 3363-3371 - Priyanka Mishra, Nancy Mehta, Santosh Kumar Vipparthi, Subrahmanyam Murala:
USWformer: Efficient Sparse Wavelet Transformer for Underwater Image Enhancement. 3372-3382 - Arturo Miguel Russell Bernal, Jane Cleland-Huang, Walter J. Scheirer:
Psych-Occlusion: Using Visual Psychophysics for Aerial Detection of Occluded Persons During Search and Rescue. 3383-3395 - Yi Yang, Lei Zhong, Huiping Zhuang:
ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning. 3396-3405 - Juheon Son, Jang-Hwan Choi:
FMD: Comprehensive Data Compression in Medical Domain via Fused Matching Distillation. 3406-3415 - Rouqaiah Al-Refai, Philipp Hempel, Clara Biagi, Philipp Terhörst:
FALCON: Fair Face Recognition via Local Optimal Feature Normalization. 3416-3426 - Minh-Quan Le, Minh-Triet Tran, Trung-Nghia Le, Tam V. Nguyen, Thanh-Toan Do:
CamoFA: A Learnable Fourier-Based Augmentation for Camouflage Segmentation. 3427-3436 - Gianluca D'Amico, Federico Nesti, Giulio Rossolini, Mauro Marinoni, Salvatore Sabina, Giorgio C. Buttazzo:
SynDRA: Synthetic Dataset for Railway Applications. 3437-3446 - Abdul Mohaimen Al Radi, Prothito Shovon Majumder, Md. Mosaddek Khan:
Blind Image Deblurring with FFT-ReLU Sparsity Prior. 3447-3456 - Benjamin Coupry, Baptiste Brument, Antoine Laurent, Jean Mélou, Yvain Quéau, Jean-Denis Durou:
Assessing the Quality of 3D Reconstruction in the Absence of Ground Truth: Application to a Multimodal Archaeological Dataset. 3457-3466 - Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig:
On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process. 3467-3476 - Sebastian Janampa, Marios Pattichis:
DT-LSD: Deformable Transformer-Based Line Segment Detection. 3477-3486 - Marzieh Mohammadi, Amir Salarpour:
Point-GN: A Non-Parametric Network Using Gaussian Positional Encoding for Point Cloud Classification. 3487-3496 - Rohan Chacko, Nicolai Häni, Eldar Khaliullin, Lin Sun, Douglas Lee:
Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation. 3497-3507 - Victor Rong, Jingxiang Chen, Sherwin Bahmani, Kiriakos N. Kutulakos, David B. Lindell:
GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling. 3508-3518 - Silvan Weder, Francis Engelmann, Johannes L. Schönberger, Akihito Seki, Marc Pollefeys, Martin R. Oswald:
ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction. 3519-3528 - Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen:
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models. 3529-3538 - Pengxiang Li, Kai Chen, Zhili Liu, Ruiyuan Gao, Lanqing Hong, Dit-Yan Yeung, Huchuan Lu, Xu Jia:
TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models. 3539-3548 - Yuan Zhang, Yutong Xie, Hu Wang, Jodie C. Avery, M. Louise Hull, Gustavo Carneiro:
A Novel Perspective for Multi-Modal Multi-Label Skin Lesion Classification. 3549-3558 - Youngjun Jun, Jiwoo Park, Kyobin Choo, Tae Eun Choi, Seong Jae Hwang:
Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models. 3559-3569 - Chengyin Li, Rafi Ibn Sultan, Prashant Khanduri, Yao Qiang, Chetty J. Indrin, Dongxiao Zhu:
AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation. 3570-3580 - Chengyin Li, Hui Zhu, Rafi Ibn Sultan, Hassan Bagher-Ebadian, Prashant Khanduri, Chetty J. Indrin, Kundan Thind, Dongxiao Zhu:
MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training. 3581-3591 - Boqi Chen, Yuanzhi Zhu, Yunke Ao, Sebastiano Caprara, Reto Sutter, Gunnar Rätsch, Ender Konukoglu, Anna Susmelj:
Generalizable Single-Source Cross-Modality Medical Image Segmentation via Invariant Causal Mechanisms. 3592-3602 - Nikolas Adaloglou, Tim Kaiser, Felix Michels, Markus Kollmann:
Rethinking Cluster-Conditioned Diffusion Models for Label-Free Image Synthesis. 3603-3613 - Shwetha Ram, Tal Neiman, Qianli Feng, Andrew Stuart, Son Tran, Trishul Chilimbi:
DreamBlend: Advancing Personalized Fine-Tuning of Text-to-Image Diffusion Models. 3614-3623 - Delin An, Pengfei Gu, Milan Sonka, Chaoli Wang, Danny Z. Chen:
Sli2Vol+: Segmenting 3D Medical Images Based on an Object Estimation Guided Correspondence Flow Network. 3624-3634 - Jonghun Kim, Inye Na, Eun Sook Ko, Hyunjin Park:
Tumor Synthesis Conditioned on Radiomics. 3635-3646 - Nahid Ul Islam, Dongao Ma, Jiaxuan Pang, Shivasakthi Senthil Velan, Michael B. Gotway, Jianming Liang:
Foundation X: Integrating Classification, Localization, and Segmentation Through Lock-Release Pretraining Strategy for Chest X-Ray Analysis. 3647-3656 - Youyuan Zhang, Xuan Ju, James J. Clark:
FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing. 3657-3666 - Sharon Chokuwa, Muhammad Haris Khan:
Divergent Domains, Convergent Grading: Enhancing Generalization in Diabetic Retinopathy Grading. 3667-3677 - Zhi Xu, Shaozhe Hao, Kai Han:
CusConcept: Customized Visual Concept Decomposition with Diffusion Models. 3678-3687 - Benito Buchheim, Max Reimann, Jürgen Döllner:
Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation. 3688-3697 - Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang:
Fine-grained Controllable Video Generation via Object Appearance and Context. 3698-3708 - Hsin-Ping Huang, Yu-Chuan Su, Ming-Hsuan Yang:
Generating Long-Take Videos via Effective Keyframes and Guidance. 3709-3720 - Rishubh Parihar, Prasanna Balaji, Raghav Magazine, Sarthak Vora, Varun Jampani, R. Venkatesh Babu:
Attribute Diffusion: Diffusion Driven Diverse Attribute Editing. 3721-3731 - Ming Kang, Fung Fung Ting, Raphaël C.-W. Phan, Chee-Ming Ting:
PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices. 3732-3741 - Taewoo Kim, Geonsu Lee, Hyukgi Lee, Seongtae Kim, Younggun Lee:
PixSwap: High-Resolution Face Swapping for Effective Reflection of Identity via Pixel-Level Supervision with Synthetic Paired Dataset. 3742-3751 - Niklas Babendererde, Haozhe Zhu, Moritz Fuchs, Jonathan Stieber, Anirban Mukhopadhyay:
Federated-Continual Dynamic Segmentation of Histopathology Guided by Barlow Continuity. 3752-3761 - Yannik Frisch, Christina Bornberg, Moritz Fuchs, Anirban Mukhopadhyay:
GAUDA: Generative Adaptive Uncertainty-Guided Diffusion-Based Augmentation for Surgical Segmentation. 3762-3771 - Zhongpai Gao, Abhishek Sharma, Meng Zheng, Benjamin Planche, Terrence Chen, Ziyan Wu:
Automated Patient Positioning with Learned 3D Hand Gestures. 3772-3781 - Xingzhe He, Zhiwen Cao, Nicholas I. Kolkin, Lantao Yu, Kun Wan, Helge Rhodin, Ratheesh Kalarot:
A Data Perspective on Enhanced Identity Preservation for Diffusion Personalization. 3782-3791 - Kangfu Mei, Nithin Gopalakrishnan Nair, Vishal M. Patel:
Improving Conditional Diffusion Models through Re-Noising from Unconditional Diffusion Priors. 3792-3801 - Mario Wieser, Daniel Siegismund, Stephan Steigele:
Revisiting Deep Archetypal Analysis for Phenotype Discovery in High Content Imaging. 3802-3811 - Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Jiale Cao, Zhong Ji, Mingming Sun:
SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior. 3812-3822 - Ziyu Zhou, Haozhe Luo, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Xiaowei Ding, Michael B. Gotway, Jianming Liang:
ACE: Anatomically Consistent Embeddings in Composition and Decomposition. 3823-3833 - Amin Ranem, John Kalkhof, Anirban Mukhopadhyay:
NCAdapt: Dynamic Adaptation with Domain-Specific Neural Cellular Automata for Continual Hippocampus Segmentation. 3834-3843 - Michele De Vita, Vasileios Belagiannis:
Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation. 3844-3854 - Ruyu Wang, Xuefeng Hou, Sabrina Schmedding, Marco F. Huber:
STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation. 3855-3865 - Abdullah Al Rahat, Hemanth Venkateswara:
Dataset Augmentation by Mixing Visual Concepts. 3866-3875 - Chentianye Xu, Xueying Zhan, Min Xu:
CryoMAE: Few-Shot Cryo-EM Particle Picking with Masked Autoencoders. 3876-3885 - Roberto Di Via, Francesca Odone, Vito Paolo Pastore:
Self-Supervised Pre-Training with Diffusion Model for Few-Shot Landmark Detection in X-Ray Images. 3886-3896 - Ziyang Zheng, Ruiyuan Gao, Qiang Xu:
Non-Cross Diffusion for Semantic Consistency. 3897-3906 - Aiman Farooq, Deepak Mishra, Santanu Chaudhury:
Survival Prediction in Lung Cancer through Multi-Modal Representation Learning. 3907-3915 - Zakaria Patel, Kirill Serkh:
Enhancing Image Layout Control with Loss-Guided Diffusion Models. 3916-3924 - Zhenyue Qin, Yiqun Zhang, Yang Liu, Dylan Campbell:
HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images. 3925-3933 - Xin Jiang, Junwei Zheng, Ruiping Liu, Jiahang Li, Jiaming Zhang, Sven Matthiesen, Rainer Stiefelhagen:
@BENCH: Benchmarking Vision-Language Models for Human-centered Assistive Technology. 3934-3943 - Haoning Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang:
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning. 3944-3953 - Abhishek Kumar Sinha, S. Manthira Moorthi:
CharDiff: Improving Sampling Convergence via Characteristic Function Consistency in Diffusion Models. 3955-3964 - Anuja Vats, Ivar Farup, Marius Pedersen, Kiran B. Raja:
Uncertainty-Aware Regularization for Image-to-Image Translation. 3965-3974 - Hongsuk Choi, Isaac Kasahara, Selim Engin, Moritz A. Graule, Nikhil Chavan Dafle, Volkan Isler:
FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection. 3975-3984 - Souhaib Attaiki, Paul Guerrero, Duygu Ceylan, Niloy J. Mitra, Maks Ovsjanikov:
GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space. 3985-3995 - Lucas N. Kirsten, Angelo Angonezi, Jose Marques, Fernanda Oliveira, Juliano Faccioni, Camila Cassel, Débora Santos de Sousa, Samlai Vedovatto, Guido Lenz, Cláudio R. Jung:
Oriented Cell Dataset: A Dataset and Benchmark for Oriented Cell Detection and Applications. 3996-4005 - Jinlin Xiang, Hillol Sarker, Bozhao Qi, Ruisu Zhang, Roger Trullo, Salvatore Badalamenti, Maria Wiekowski, Annie Kruger, Etienne Pochet, Qi Tang, Wei Zhao:
Endoscopic Scoring and Localization in Unconstrained Clinical Trial Videos. 4006-4015 - Vamsi Krishna Vasa, Peijie Qiu, Wenhui Zhu, Yujian Xiong, Oana M. Dumitrascu, Yalin Wang:
Context-Aware Optimal Transport Learning for Retinal Fundus Image Enhancement. 4016-4025 - Libing Zeng, Nima Khademi Kalantari:
Analyzing and Improving the Skin Tone Consistency and Bias in Implicit 3D Relightable Face Generators. 4026-4035 - Sheng Zhang, Jinge Wu, Junzhi Ning, Guang Yang:
DMRN: A Dynamical Multi-Order Response Network for the Robust Lung Airway Segmentation. 4036-4045 - Shahzad Ahmad, Sania Bano, Sukalpa Chanda, Santosh Kumar Vipparthi, Subrahmanyam Murala:
TRUST: Time-Domain Residual Unsupervised Stability Technique for Improved Heart Rate Estimation. 4046-4055 - Yoni Gozlan, Antoine Falisse, Scott D. Uhlrich, Anthony A. Gatti, Michael Black, Jennifer L. Hicks, Scott L. Delp, Akshay Chaudhari:
OpenCapBench: A Benchmark to Bridge Pose Estimation and Biomechanics. 4056-4065 - Jianyi Zhang, Hao Yang, Ang Li, Xin Guo, Pu Wang, Haiming Wang, Yiran Chen, Hai Li:
MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning. 4066-4076 - Danfeng Guo, Sanchit Agarwal, Yu-Hsiang Lin, Jiun-Yu Kao, Tagyoung Chung, Nanyun Peng, Mohit Bansal:
Improving Faithfulness of Text-to-Image Diffusion Models through Inference Intervention. 4077-4086 - Idan Kligvasser, Regev Cohen, George Leifman, Ehud Rivlin, Michael Elad:
Anchored Diffusion for Video Face Reenactment. 4087-4097 - Youssof Nawar, Nouran Soliman, Moustafa Wassel, Mohamed ElHabebe, Noha Adly, Marwan Torki, Ahmed Elmassry, Islam Ahmed:
DiffuPT: Class Imbalance Mitigation for Glaucoma Detection via Diffusion Based Generation and Model Pretraining. 4098-4107 - Zoltán Ádám Milacski, Koichiro Niinuma, Ryosuke Kawamura, Fernando De la Torre, László A. Jeni:
GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts. 4108-4118 - Naveen Karunanayake, Suranga Seneviratne, Sanjay Chawla:
CRAFT: Class Ranking Aware Fine-Tuning for Enhanced Out-of-Distribution Detection. 4119-4128 - Fatemeh Haghighi, Michael B. Gotway, Jianming Liang:
Learning Anatomy-Disease Entangled Representation. 4129-4141 - Yilmaz Korkmaz, Vishal M. Patel:
MambaRecon: MRI Reconstruction with Structured State Space Models. 4142-4152 - Sai Bharath Chandra Gutha, Ricardo Vinuesa, Hossein Azizpour:
Inverse Problems with Diffusion Models: A MAP Estimation Perspective. 4153-4162 - Steven Hogue, Chenxu Zhang, Yapeng Tian, Xiaohu Guo:
Joint Co-Speech Gesture and Expressive Talking Face Generation Using Diffusion with Adapters. 4163-4172 - Fazle Rahat, M. Shifat Hossain, Md Rubel Ahmed, Sumit Kumar Jha, Rickard Ewetz:
Data Augmentation for Image Classification Using Generative AI. 4173-4182 - Qianwen Lu, Xingchao Yang, Takafumi Taketomi:
BeautyBank: Encoding Facial Makeup in Latent Space. 4183-4193 - Trung Dinh Quoc Dang, Huy Hoang Nguyen, Aleksei Tiulpin:
Image-Level Regression for Uncertainty-Aware Retinal Image Segmentation. 4194-4204 - Remi Chierchia, Léo Lebrat, David Ahmedt-Aristizabal, Olivier Salvado, Clinton Fookes, Rodrigo Santa Cruz:
SALVE: A 3D Reconstruction Benchmark of Wounds from Consumer-Grade Videos. 4205-4214 - Haeil Lee, Hansang Lee, Seoyeon Gye, Junmo Kim:
Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models Using Stepwise Spectral Analysis. 4215-4224 - Chun-Hong Cheng, Jing Wei Chin, Kwan Long Wong, Tsz Tai Chan, Hau Ching Lo, Kwan Lok Pang, Richard Hau Yue So, Bryan Yan:
Remote Blood Pressure Estimation from Facial Videos Using Transfer Learning: Leveraging PPG to rPPG Conversion. 4225-4236 - Ali Karami, Thi Kieu Khanh Ho, Narges Armanfard:
Graph-Jigsaw Conditioned Diffusion Model for Skeleton-Based Video Anomaly Detection. 4237-4247 - Tawsifur Rahman, Alexander S. Baras, Rama Chellappa:
CEMIL: Contextual Attention Based Efficient Weakly Supervised Approach for Histopathology Image Classification. 4248-4257 - Rasel Ahmed Bhuiyan, Adam Czajka:
Forensic Iris Image-Based Post-Mortem Interval Estimation. 4258-4267 - Sabina Martyniak, Joanna Kaleta, Diego Dall'Alba, Michal Naskret, Szymon Plotka, Przemyslaw Korzeniowski:
SimuScope: Realistic Endoscopic Synthetic Dataset Generation Through Surgical Simulation and Diffusion Models. 4268-4278 - Tonmoy Hossain, Jing Ma, Jundong Li, Miaomiao Zhang:
Invariant Shape Representation Learning for Image Classification. 4279-4289 - Kaito Shiku, Kazuya Nishimura, Daiki Suehiro, Kiyohito Tanaka, Ryoma Bise:
Ordinal Multiple-instance Learning for Ulcerative Colitis Severity Estimation with Selective Aggregated Transformer. 4290-4299 - Koushik Biswas, Amit Reza, Meghana Karri, Debesh Jha, Hongyi Pan, Nikhil Kumar Tomar, Aliza Subedi, Smriti Regmi, Ulas Bagci:
Optimizing Neural Network Effectiveness via Non-monotonicity Refinement. 4300-4309 - Justin Theiss, Norman Müller, Daeil Kim, Aayush Prakash:
Multi-View Image Diffusion via Coordinate Noise and Fourier Attention. 4310-4319 - Pamela Osuna-Vargas, Maren H. Wehrheim, Lucas Zinz, Johanna V. Rahm, Ashwin Balakrishnan, Alexandra Kaminer, Mike Heilemann, Matthias Kaschube:
Denoising Diffusion Models for High-Resolution Microscopy Image Restoration. 4320-4330 - Utkarsh Nath, Rajeev Goel, Eun Som Jeon, Changhoon Kim, Kyle Min, Yezhou Yang, Yingzhen Yang, Pavan K. Turaga:
Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation. 4331-4341 - Juhyung Ha, Jong Sung Park, David Crandall, Eleftherios Garyfallidis, Xuhong Zhang:
Multi-Resolution Guided 3D GANs for Medical Image Translation. 4342-4351 - Muhammad Sohaib, Siyavash Shabani, Sahar A. Mohammed, Garrett Winkelmaier, Bahram Parvin:
Multi-Aperture Transformers for 3D (MAT3D) Segmentation of Clinical and Microscopic Images. 4352-4361 - Joy Dhar, Nayyar Zaidi, Maryam Haghighat, Sudipta Roy, Puneet Goyal, Azadeh Alavi, Vikas Kumar:
Multimodal Fusion Learning with Dual Attention for Medical Imaging. 4362-4371 - Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo:
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation. 4372-4381 - Man Minh Ho, Shikha Dubey, Yosep Chong, Beatrice Knudsen, Tolga Tasdizen:
F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation. 4382-4391 - Vaibhav Ganatra, Siddhartha Gairola, Pallavi Joshi, Anand Balasubramaniam, Kaushik Murali, Arivunithi Varadharajan, Bellamkonda Mallikarjuna, Nipun Kwatra, Mohit Jain:
SmartKC++: Improving Performance of Smartphone-Based Corneal Topographers. 4392-4399 - Kai Wang, Fei Yang, Bogdan Raducanu, Joost van de Weijer:
Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier. 4400-4409 - Antoine P. Sanner, Jonathan Stieber, Nils F. Grauhan, Suam Kim, Marc A. Brockmann, Ahmed E. Othman, Anirban Mukhopadhyay:
Federated Voxel Scene Graph for Intracranial Hemorrhage. 4410-4419 - Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen:
Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing. 4420-4429 - Pengfei Guo, Can Zhao, Dong Yang, Ziyue Xu, Vishwesh Nath, Yucheng Tang, Benjamin Simon, Mason Belue, Stephanie A. Harmon, Baris Turkbey, Daguang Xu:
MAISI: Medical AI for Synthetic Imaging. 4430-4441 - Sebastian Thiele, Jacqueline Kockwelp, Joachim Wistuba, Sabine Kliesch, Jörg Gromoll, Benjamin Risse:
Investigating Imaging, Annotation and Self-Supervision for the Classification of Continuously Developing Cells in Histological Whole Slide Images. 4442-4451 - Qiwen Deng, Yangcen Liu:
Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network. 4452-4461 - Xiaoyang Wei, Camille Kurtz, Florence Cloppet:
Relaxing Binary Constraints in Contrastive Vision-Language Medical Representation Learning. 4462-4471 - Hyunsoo Lee, Minsoo Kang, Bohyung Han:
Diffusion-Based Conditional Image Editing Through Optimized Inference with Guidance. 4472-4480 - Ciprian A. Corneanu, Qianli Feng, Aleix M. Martínez:
Structured Human Assessment of Text-to-Image Generative Models. 4481-4490 - Raman Dutt, Ondrej Bohdal, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy M. Hospedales:
MemControl: Mitigating Memorization in Diffusion Models via Automated Parameter Selection. 4491-4501 - Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu, Peijie Qiu, Xiwen Chen, Yi Su, Yujian Xiong, Zhangsihao Yang, Yanxi Chen, Yalin Wang:
CUNSB-RFIE: Context-Aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement. 4502-4511 - Shuhan Xiao, Lukas Klein, Jens Petersen, Philipp Vollmuth, Paul F. Jaeger, Klaus H. Maier-Hein:
Enhancing Predictive Imaging Biomarker Discovery Through Treatment Effect Analysis. 4512-4522 - Yan Zeng, Masanori Suganuma, Takayuki Okatani:
Inverting the Generation Process of Denoising Diffusion Implicit Models: Empirical Evaluation and a Novel Method. 4516-4524 - Chaewon Kim, Seung Jun Moon, Gyeong-Moon Park:
WINE: Wavelet-Guided GAN Inversion and Editing for High-Fidelity Refinement. 4523-4532 - Mingyu Sheng, Jianan Fan, Dongnan Liu, Ron Kikinis, Weidong Cai:
AMNCutter: Affinity-Attention-Guided Multi-View Normalized Cutter for Unsupervised Surgical Instrument Segmentation. 4533-4544 - Kenta Horikawa, Mariko Isogawa, Hideo Saito, Shohei Mori:
Dense Depth from Event Focal Stack. 4545-4553 - Xulin Fan, Heting Gao, Ziyi Chen, Peng Chang, Mei Han, Mark Hasegawa-Johnson:
SyncDiff: Diffusion-Based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization. 4554-4563 - Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla:
Morag - Multi-Fusion Retrieval Augmented Generation for Human Motion. 4564-4573 - Shahzad Ahmad, Sania Bano, Sachin Verma, Yogesh Singh Rawat, Sukalpa Chanda, Santosh Kumar Vipparthi, Subrahmanyam Murala:
PULSE: Physiological Understanding with Liquid Signal Extraction. 4574-4584 - Xindi Wu, Uriel Singer, Zhaojiang Lin, Andrea Madotto, Xide Xia, Yifan Xu, Paul A. Crook, Xin Luna Dong, Seungwhan Moon:
Corgi: Cached Memory Guided Video Generation. 4585-4594 - Sungkyu Yang, Woohyun Park, Kwangil Yim, Mansu Kim:
MFTrans: A Multi-Resolution Fusion Transformer for Robust Tumor Segmentation in Whole Slide Images. 4595-4605 - Zhenyuan Dong, Sai Qian Zhang:
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing. 4606-4615 - Zhuoyi Yang, Liyue Shen:
TempA-VLP: Temporal-Aware Vision-Language Pretraining for Longitudinal Exploration in Chest X-Ray Image. 4625-4634 - Fang-Yi Su, Tzu-Hung Chang, Jung-Hsien Chiang:
DiffuCE: Expert-Level CBCT Image Enhancement Using a Novel Conditional Denoising Diffusion Model with Latent Alignment. 4635-4644 - Vasco Ramos, Yonatan Bitton, Michal Yarom, Idan Szpektor, João Magalhães:
Contrastive Sequential-Diffusion Learning: Non-Linear and Multi-Scene Instructional Video Synthesis. 4645-4654 - Tapas Kumar Dutta, Snehashis Majhi, Deepak Ranjan Nayak, Debesh Jha:
SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation. 4655-4664 - Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, Junzhou Huang, Vladimir Pavlovic, Hao Wang, Molei Tao, Dimitris N. Metaxas:
SODA: Spectral Orthogonal Decomposition Adaptation for Diffusion Models. 4665-4682 - Gurucharan Marthi Krishna Kumar, Janine D. Mendola, Amir Shmuel:
Nestedmorph: Enhancing Deformable Medical Image Registration With Nested Attention Mechanisms. 4683-4692 - Yaoxin Zhuo, Zachary Bessinger, Lichen Wang, Naji Khosravan, Baoxin Li, Sing Bing Kang:
TFM2: Training-Free Mask Matching for Open-Vocabulary Semantic Segmentation. 4693-4703 - Marvin Burges, Sebastian Zambanini, Robert Sablatnig:
Interactive Object Detection for Tiny Objects in Large Remotely Sensed Images. 4704-4713 - Jingchen Sun, Rohan Sharma, Vishnu Suresh Lokhande, Changyou Chen:
Cross-Modal Feature Alignment and MMD Improve Robustness of Prompt Tuning. 4714-4724 - Yicheng Wang, Zhikang Zhang, Jue Wang, David Fan, Zhenlin Xu, Linda Liu, Xiang Hao, Vimal Bhat, Xinyu Li:
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning. 4725-4735 - Sanggeon Yun, Ryozo Masukawa, Minhyoung Na, Mohsen Imani:
Missiongnn: Hierarchical Multimodal GNN-Based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation. 4736-4745 - Wendi Yang, Zihang Jiang, Shang Zhao, S. Kevin Zhou:
PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery. 4746-4756 - Ayush Gupta, Rama Chellappa:
MimicGait: A Model Agnostic approach for Occluded Gait Recognition Using Correlational Knowledge Distillation. 4757-4766 - Ekin Celikkan, Timo Kunzmann, Yertay Yeskaliyev, Sibylle Itzerott, Nadja Klein, Martin Herold:
WeedsGalore: A Multispectral and Multitemporal UAV-Based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields. 4767-4777 - Percy Lam, Sooyong Park, Weiwei Chen, Lavindra de Silva, Ioannis K. Brilakis:
CRAAC: Consistency Regularised Active Learning with Automatic Corrections for Real-Life Road Image Annotations. 4778-4787 - Sina Malakouti, Aysan Aghazadeh, Ashmit Khandelwal, Adriana Kovashka:
Benchmarking VLMs' Reasoning About Persuasive Atypical Images. 4788-4798 - I-Ting Tsai, Bharath Hariharan:
3D Synthesis for Architectural Design. 4799-4809 - Yan Yang, Utpal Bose, James Broadbent, Sally Stockwell, Keren Byrne, Md. Zakir Hossain, Eric A. Stone, Shannon Dillon:
Flowering Time Prediction of Wheat From DIA-MS Data. 4810-4820 - Xingjian Diao, Ming Cheng, Wayner Barrios, SouYoung Jin:
FT2TF: First-Person Statement Text-to-Talking Face Generation. 4821-4830 - Mayssa Zaier, Hazem Wannous, Hassen Drira:
Geometry-Aware Deep Learning for 3D Skeleton-Based Motion Prediction. 4831-4840 - Sanjana Sinha, Brojeshwar Bhowmick, Lokender Tiwari, Sushovan Chanda:
DisFlowEm: One-Shot Emotional Talking Head Generation Using Disentangled Pose and Expression Flow-Guidance. 4841-4851 - Sombit Dey, Ozan Unal, Christos Sakaridis, Luc Van Gool:
Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding. 4852-4861 - Xiaoyu Xiang, Liat Sless Gorelik, Yuchen Fan, Omri Armstrong, Forrest N. Iandola, Yilei Li, Ita Lifshitz, Rakesh Ranjan:
Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds. 4872-4881 - Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan:
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models. 4882-4890 - Ce Zheng, Xianpeng Liu, Qucheng Peng, Tianfu Wu, Pu Wang, Chen Chen:
DiffMesh: A Motion-Aware Diffusion Framework for Human Mesh Recovery from Videos. 4891-4901 - Bardia Safaei, Vishal M. Patel:
Active Learning for Vision-Language Models. 4902-4912 - Yoshitomo Matsubara, Matteo Mendula, Marco Levorato:
A Multi-Task Supervised Compression Model for Split Computing. 4913-4922 - Aashish Rai, Srinath Sridhar:
EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos. 4935-4946 - Risako Tanigawa, Kenji Ishikawa, Noboru Harada, Yasuhiro Oikawa:
SoundSil-DS: Deep Denoising and Segmentation of Sound-field Images with Silhouettes. 4947-4956 - Bingqing Zhang, Zhuo Cao, Heming Du, Xin Yu, Xue Li, Jiajun Liu, Sen Wang:
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. 4957-4967 - Vittorio Pipoli, Federico Bolelli, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Costantino Grana, Rita Cucchiara, Elisa Ficarra:
Semantically Conditioned Prompts for Visual Recognition Under Missing Modality Scenarios. 4968-4977 - Shubham Agarwal, Raz Birman, Ofer Hadar:
WARLearn: Weather-Adaptive Representation Learning. 4978-4987 - Hai Nguyen-Truong, E-Ro Nguyen, Tuan-Anh Vu, Minh-Triet Tran, Binh-Son Hua, Sai-Kit Yeung:
Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding. 4988-4998 - Tetsushi Yamada, Simone Di Santo:
Partial Texture VAE: Color and Texture Encoder for Rock Particle Images. 4999-5008 - Pramook Khungurn:
Talking Head Anime 4: Distillation for Real-Time Performance. 5018-5029 - Anh-Quan Cao, Maximilian Jaritz, Matthieu Guillaumin, Raoul de Charette, Loris Bazzani:
LATTECLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts. 5030-5040 - Denys Rozumnyi, Nadine Bertsch, Othman Sbai, Filippo Arcadu, Yuhua Chen, Artsiom Sanakoyeu, Manoj Kumar, Catherine Herold, Robin Kips:
XR-MBT: Multi-Modal Full Body Tracking for XR Through Self-Supervision with Learned Depth Point Cloud Registration. 5041-5050 - Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis:
Similarity Over Factuality: Are we Making Progress on Multimodal Out-of-Context Misinformation Detection? 5041-5050 - Yanan Niu, Roy Sarkis, Demetri Psaltis, Mario Paolone, Christophe Moser, Luisa Lambertini:
Solar Multimodal Transformer: Intraday Solar Irradiance Predictor Using Public Cameras and Time Series. 5051-5060 - Sina Hajimiri, Ismail Ben Ayed, Jose Dolz:
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation. 5061-5071 - Xin Ye, Feng Tao, Abhirup Mallik, Burhaneddin Yaman, Liu Ren:
LORD: Large Models Based Opposite Reward Design for Autonomous Driving. 5072-5081 - Md Mahedi Hasan, Shoaib Meraj Sami, Nasser M. Nasrabadi:
CLFace: A Scalable and Resource-Efficient Continual Learning Framework for Lifelong Face Recognition. 5082-5091 - Dingkun Yan, Liang Yuan, Erwin Wu, Yuma Nishioka, Issei Fujishiro, Suguru Saito:
ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model. 5092-5102 - Lior Dikstein, Ariel Lapid, Arnon Netzer, Hai Victor Habi:
Data Generation for Hardware-Friendly Post-Training Quantization. 5103-5113 - Bo Lang, Mooi Choo Chuah:
Event-Guided Video Transformer for End-to-End 3D Human Pose Estimation. 5114-5124 - Wele Gedara Chaminda Bandara, Vishal M. Patel:
Deep Metric Learning for Unsupervised Remote Sensing Change Detection. 5125-5135 - Xuanchen Wang, Heng Wang, Dongnan Liu, Weidong Cai:
Dance any Beat: Blending Beats with Visuals in Dance Video Generation. 5136-5146 - Valentin Bieri, Marco Zamboni, Nicolas S. Blumer, Qingxuan Chen, Francis Engelmann:
OpenCity3D: What do Vision-Language Models Know About Urban Environments? 5147-5155 - Abid Ali, Rui Dai, Ashish Marisetty, Guillaume Astruc, Monique Thonnat, Jean-Marc Odobez, Susanne Thümmler, François Brémond:
Loose Social-Interaction Recognition in Real-World Therapy Scenarios. 5156-5165 - Julius Pesonen, Teemu Hakala, Väinö Karjalainen, Niko Koivumäki, Lauri Markelin, Anna-Maria Raita-Hakola, Juha Suomalainen, Ilkka Pölönen, Eija Honkavaara:
Detecting Wildfires on UAVs with Real-Time Segmentation Trained by Larger Teacher Models. 5166-5176 - Ying Shen, Daniel Bis, Cynthia Lu, Ismini Lourentzou:
ELBA: Learning by Asking for Embodied Visual Navigation and Task Completion. 5177-5186 - Tim Dieter Eberhardt, Tim Brühl, Robin Schwager, Tin Stribor Sohn, Wilhelm Stork:
Clarity Amidst Blur: A Deterministic Method for Synthetic Generation of Water Droplets on Camera Lenses. 5187-5196 - Siddharth Seth, Rishabh Dabral, Diogo C. Luvizon, Marc Habermann, Ming-Hsuan Yang, Christian Theobalt, Adam Kortylewski:
PocoLoco: A Point Cloud Diffusion Model of Human Shape in Loose Clothing. 5197-5206 - Hanyuan Xiao, Yingshu Chen, Huajian Huang, Haolin Xiong, Jing Yang, Pratusha Prasad, Yajie Zhao:
Localized Gaussian Splatting Editing with Contextual Awareness. 5207-5217 - Doyoung Park, Naresh Reddy Yarram, Sunjin Kim, Minkyu Kim, Seongho Cho, Taehee Lee:
Text Change Detection in Multilingual Documents Using Image Comparison. 5218-5227 - Zihao Zou, Jiaming Liu, Shirin Shoushtari, Yubo Wang, Ulugbek S. Kamilov:
FLAIR: A Conditional Diffusion Framework with Applications to Face Video Restoration. 5228-5238 - Wenjun Huang, Yang Ni, Arghavan Rezvani, Sungheon Jeong, Hanning Chen, Yezi Liu, Fei Wen, Mohsen Imani:
Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach. 5239-5249 - Wele Gedara Chaminda Bandara, Nithin Gopalakrishnan Nair, Vishal M. Patel:
DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Remote Sensing Change Detection. 5250-5262 - Yusuke Akamatsu, Terumi Umematsu, Hitoshi Imaoka, Shizuko Gomi, Hideo Tsurushima:
ComFace: Facial Representation Learning with Synthetic Data for Comparing Faces. 5263-5273 - Hankyeol Lee, Gawon Seo, Wonseok Choi, Geunyoung Jung, Kyungwoo Song, Jiyoung Jung:
Enhancing Visual Classification Using Comparative Descriptors. 5274-5283 - Luca Collorone, Stefano D'Arrigo, Massimiliano Pappa, Guido Maria D'Amely di Melendugno, Giovanni Ficarra, Fabio Galasso:
ANTHROPOS-V: Benchmarking the Novel Task of Crowd Volume Estimation. 5284-5294 - Raquel Panadero, Dominik Schörkhuber, Margrit Gelautz:
Importance-Guided Interpretability and Pruning for Video Transformers in Driver Action Recognition. 5295-5304 - Puneet Kumar, Shreshtha Misra, Zhuhong Shao, Bin Zhu, Balasubramanian Raman, Xiaobai Li:
Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio and Textual Data. 5305-5315 - Anudeep Vurity, Emanuela Marasco, Raghavendra Ramachandra, Jongwoo Park:
ColFigPhotoAttnNet: Reliable Finger Photo Presentation Attack Detection Leveraging Window-Attention on Color Spaces. 5316-5325 - Zhao-Yang Wang, Jiang Liu, Jieneng Chen, Rama Chellappa:
VM-Gait: Multi-Modal 3D Representation Based on Virtual Marker for Gait Recognition. 5326-5335 - Kevin Flanagan, Dima Damen, Michael Wray:
Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval. 5336-5345 - Hao Fu, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami:
CLIPScope: Enhancing Zero-Shot OOD Detection with Bayesian Scoring. 5346-5355 - Ahmad Arrabi, Xiaohan Zhang, Waqas Sultani, Chen Chen, Safwan Wshah:
Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance. 5356-5366 - Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Eric Granger:
A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. 5367-5376 - Mu Cai, Zeyi Huang, Yuheng Li, Utkarsh Ojha, Haohan Wang, Yong Jae Lee:
An Investigation on LLMs' Visual Understanding Ability Using SVG for Image-Text Bridging. 5377-5386 - Deepti Rawat, Keshav Gupta, Aryamaan Basu Roy, Ravi Kiran Sarvadevabhatla:
DashCop: Automated E-Ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos. 5387-5397 - Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Yonghoon Jung, Sanghyun Seo:
Make VLM Recognize Visual Hallucination on Cartoon Character Image with Pose Information. 5398-5407 - Yuhang He, Sangyun Shin, Anoop Cherian, Niki Trigoni, Andrew Markham:
SoundLoc3D: Invisible 3D Sound Source Localization and Classification Using a Multimodal RGB-D Acoustic Camera. 5408-5418 - Hiroki Nishizawa, Keitaro Tanaka, Asuka Hirata, Shugo Yamaguchi, Qi Feng, Masatoshi Hamanaka, Shigeo Morishima:
SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering. 5419-5428 - Haoyu Jiang, Zhi-Qi Cheng, Gabriel Moreira, Jiawen Zhu, Jingdong Sun, Bukun Ren, Jun-Yan He, Qi Dai, Xian-Sheng Hua:
UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval. 5429-5438 - Jash Dalvi, Ali Dabouei, Gunjan Dhanuka, Min Xu:
Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection. 5439-5448 - Raza Imam, Hanan Gani, Muhammad Huzaifa, Karthik Nandakumar:
Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models. 5449-5459 - Evelyn A. Stump, Francesco Luzi, Leslie M. Collins, Jordan M. Malof:
Meta-Learning for Color-to-Infrared Cross-Modal Style Transfer. 5460-5469 - Tevin Moodley, Dustin van der Haar:
I3D-AE-LSTM: A 2-Stream Autoencoder for Action Quality Assessment Using a Newly Created Cricket Batsman Video Dataset. 5470-5478 - Junno Yun, Mehmet Akçakaya:
Generative Model-Based Fusion for Improved Few-Shot Semantic Segmentation of Infrared Images. 5479-5488 - Pinrui Yu, Zhenglun Kong, Pu Zhao, Peiyan Dong, Hao Tang, Fei Sun, Xue Lin, Yanzhi Wang:
Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation. 5489-5499 - Debolena Basak, Soham Bhatt, Sahith Kanduri, Maunendra Sankar Desarkar:
Aerial Mirage: Unmasking Hallucinations in Large Vision Language Models. 5500-5508 - Bhavin Jawade, João V. B. Soares, Kapil Thadani, Deen Dayal Mohan, Amir Erfan Eshratifar, Benjamin Culpepper, Paloma de Juan, Srirangaraj Setlur, Venu Govindaraju:
SCOT: Self-Supervised Contrastive Pretraining for Zero-Shot Compositional Retrieval. 5509-5519 - Dipu Manandhar, Paul Guerrero, Zhaowen Wang, John P. Collomosse:
CLASS: Conditional Latent Architecture for Search and Synthesis of Design Layouts. 5520-5529 - Seon-Ho Lee, Jue Wang, David Fan, Zhikang Zhang, Linda Liu, Xiang Hao, Vimal Bhat, Xinyu Li:
Now you see Me: Context-Aware Automatic Audio Description. 5530-5539 - Niharika Hegde, Shishir Muralidhara, René Schuster, Didier Stricker:
Modality-Incremental Learning with Disjoint Relevance Mapping Networks for Image-Based Semantic Segmentation. 5540-5549 - Donggeun Kim, Yujin Jo, Myungjoo Lee, Taesup Kim:
Retaining and Enhancing Pre-trained Knowledge in Vision-Language Models with Prompt Ensembling. 5550-5559 - Junha Lee, Sojung An, Sujeong You, Nam Ik Cho:
Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation. 5560-5569 - Maksim Golyadkin, Ianis Plevokas, Ilya Makarov:
Closing the Domain Gap in Manga Colorization via Aligned Paired Dataset. 5580-5590 - Anurag Deo, Savita Bhat, Shirish S. Karande:
VisualFusion: Enhancing Blog Content with Advanced Infographic Pipeline. 5591-5600 - Daniel Steininger, Julia Simon, Andreas Trondl, Markus Murschitz:
TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations. 5601-5610 - Idris Zakariyya, Linda Tran, Kaushik Bhargav Sivangi, Paul Henderson, Fani Deligianni:
Differentially Private Integrated Decision Gradients (IDG-DP) for Radar-Based Human Activity Recognition. 5611-5622 - Suguru Onda, Ryan Farrell:
The FineView Dataset: A 3D Scanned Multi-View Object Dataset of Fine-Grained Category Instances. 5623-5634 - Deepayan Das, Davide Talon, Massimiliano Mancini, Yiming Wang, Elisa Ricci:
One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering. 5635-5645 - Tom Gillooly, Jean-Baptiste Thomas, Jon Yngve Hardeberg, Giuseppe Claudio Guarnera:
Image Adaptation for Colour Vision Deficient Viewers Using Vision Transformers. 5646-5655 - Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos:
SenCLIP: Enhancing Zero-Shot Land-Use Mapping for Sentinel-2 with Ground-Level Prompting. 5656-5665 - Zhuowen Zou, Prathyush Poduval, Narayan Srinivasa, Mohsen Imani:
Hyperdimensional Representation for Adaptive Information Association and Memorization. 5666-5675 - Sahil Goyal, Abhinav Mahajan, Swasti Mishra, Prateksha Udhayanan, Tripti Shukla, K. J. Joseph, Balaji Vasan Srinivasan:
Design-O-Meter: Towards Evaluating and Refining Graphic Designs. 5676-5686 - Muhammad Awais, Ali Husain Salem Abdulla Alharthi, Amandeep Kumar, Hisham Cholakkal, Rao Muhammad Anwer:
AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning. 5687-5696 - Zhuo Xu, Xiang Xiang:
Learning Visual-Semantic Hierarchical Attribute Space for Interpretable Open-Set Recognition. 5697-5706 - Harini S. I, Somesh Singh, Yaman Kumar Singla, Aanisha Bhattacharyya, Veeky Baths, Changyou Chen, Rajiv Ratn Shah, Balaji Krishnamurthy:
Long-Term Ad Memorability: Understanding & Generating Memorable Ads. 5707-5718 - Debasmita Pal, Redwan Sony, Arun Ross:
A Parametric Approach to Adversarial Augmentation for Cross-Domain Iris Presentation Attack Detection. 5719-5729 - Abhishek Rajora, Shubham Gupta, Suman Kundu:
Cross-Aligned Fusion For Multimodal Understanding. 5730-5740 - Hanwen Zheng, Sijia Wang, Chris Thomas, Lifu Huang:
Advancing Chart Question Answering with Robust Chart Component Recognition. 5741-5750 - Moyuru Yamada, Nimish Dharamshi, Ayushi Kohli, Prasad Kasu, Ainulla Khan, Manu Ghulyani:
Unleashing Potentials of Vision-Language Models for Zero-Shot HOI Detection. 5751-5760 - Zi-Xiang Xia, Sudeep Fadadu, Yi Shi, Louis Foucard:
Robust Long-Range Perception Against Sensor Misalignment in Autonomous Vehicles. 5761-5770 - Felix Hertlein, Alexander Naumann, York Sure-Vetter:
DocMatcher: Document Image Dewarping via Structural and Textual Line Matching. 5771-5780 - Dulanga Weerakoon, Vigneshwaran Subbaraju, Joo Hwee Lim, Archan Misra:
NeuroViG - Integrating Event Cameras for Resource-Efficient Video Grounding. 5781-5790 - Haiyu Wu, Sicong Tian, Huayu Li, Kevin W. Bowyer:
LogicNet: A Logical Consistency Embedded Face Attribute Learning Network. 5791-5800 - Hasnat Md Abdullah, Tian Liu, Kangda Wei, Shu Kong, Ruihong Huang:
UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark. 5801-5811 - Neha Choudhary, Poonam Goyal, Devashish Siwatch, Atharva Chandak, Harsh Mahajan, Varun Khurana, Yaman Kumar:
AdQuestA: Knowledge-Guided Visual Question Answer Framework for Advertisements. 5812-5821 - Raymond Yu, Paul Han, Piper Wolters, Favyen Bastani:
OPTIMUS: Observing Persistent Transformations in Multi-Temporal Unlabeled Satellite-Data. 5822-5830 - María Escobar, Juanita Puentes, Cristhian Forigua, Jordi Pont-Tuset, Kevis-Kokitsi Maninis, Pablo Arbeláez:
EgoCast: Forecasting Egocentric Human Pose in the Wild. 5831-5841 - Cheng-En Wu, Jinhong Lin, Yu Hen Hu, Pedro Morgado:
Patch Ranking: Token Pruning as Ranking Prediction for Efficient CLIP. 5842-5851 - Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt:
Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery. 5852-5862 - Samyak Rawlekar, Shubhang Bhatnagar, Narendra Ahuja:
PositiveCoOp: Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations. 5863-5872 - Alexander Ponomarchuk, Ivan Kruzhilov, Gleb Mazanov, Ruslan Utegenov, Artem Shadrin, Galina Zubkova, Ivan Bessonov, Pavel Blinov:
CardioSyntax: End-to-End SYNTAX Score Prediction - Dataset, Benchmark and Method. 5873-5883 - Ziqiang Dang, Jianfang Li, Lin Liu:
Cascaded Dual Vision Transformer for Accurate Facial Landmark Detection. 5884-5894 - Charles Gaydon, Floryne Roche:
PureForest: A Large-Scale Aerial Lidar and Aerial Imagery Dataset for Tree Species Classification in Monospecific Forests. 5895-5904 - Ce Zhang, Simon Stepputtis, Katia P. Sycara, Yaqi Xie:
Enhancing Vision-Language Few-Shot Adaptation with Negative Learning. 5905-5915 - Jia-Wei Liao, Winston Wang, Tzu-Sian Wang, Li-Xuan Peng, Ju-Hsuan Weng, Cheng-Fu Chou, Jun-Cheng Chen:
DiffQRCoder: Diffusion-Based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement. 5916-5925 - Jingjiao Zhao, Jiaju Li, Dongze Lian, Liguo Sun, Pin Lv:
DualCIR: Enhancing Training-Free Composed Image Retrieval via Dual-Directional Descriptions. 5926-5936 - Ee Yeo Keat, Hao Zhang, Alexander Matyasko, Basura Fernando:
Deduce and Select Evidences with Language Models for Training-Free Video Goal Inference. 5937-5947 - Luca Scofano, Alessio Sampieri, Edoardo De Matteis, Indro Spinelli, Fabio Galasso:
Social EgoMesh Estimation. 5948-5958 - Xiang Huang, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Baigui Sun:
DyRoNet: Dynamic Routing and Low-Rank Adapters for Autonomous Driving Streaming Perception. 5959-5968 - Siyuan Huang, Ram Prabhakar, Yuxiang Guo, Rama Chellappa, Cheng Peng:
VILLS: Video-Image Learning to Learn Semantics for Person Re-Identification. 5969-5979 - Sumin Hu, Youngmin Yoo, Jeeseong Kim, Changsoo Lim, Doohyun Cho, Bongnam Kang:
A Generic Vehicle-to-Sensor Calibration Framework. 5980-5989 - Christian Benz, Volker Rodehorst:
Crackstructures and Crackensembles: The Power of Multi-View for 2.5D Crack Detection. 5990-5999 - Shuo Chen, Zhen Han, Bailan He, Jianzhe Liu, Mark Buckley, Yao Qin, Philip Torr, Volker Tresp, Jindong Gu:
Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning? 6000-6010 - Rupanjali Kukal, Jay Patravali, Fuxun Yu, Simranjit Singh, Nikolaos Karianakis, Rishi Madhok:
Click&Describe: Multimodal Grounding and Tracking for Aerial Objects. 6011-6021 - Wenzhao Qiu, Shanmin Pang, Hao Zhang, Jianwu Fang, Jianru Xue:
HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning. 6022-6031 - Jinnan Chen, Chen Li, Gim Hee Lee:
DiHuR: Diffusion-Guided Generalizable Human Reconstruction. 6032-6041 - Ashutosh Chaubey, Anoubhav Agrawal, Sartaki Sinha Roy, Aayush Agrawal, Susmita Ghose:
ContextIQ: A Multimodal Expert-Based Video Retrieval System for Contextual Advertising. 6042-6052 - Aishwarya Agarwal, Srikrishna Karanam, Tripti Shukla, Balaji Vasan Srinivasan:
An Image is Worth Multiple Words: Multi-Attribute Inversion for Constrained Text-To-Image Synthesis. 6053-6062 - Xinhao Zhou, Tong Wang, Zhaodong Liu, Hao Wei, Guangyuan Pan:
A Regional-Level Resource-Saving Model for Winter Road Surface Snow Detection in Extreme Weathers. 6063-6072 - Nicola Fanelli, Gennaro Vessio, Giovanna Castellano:
I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting. 6073-6082 - Eman Ali, Sathira Silva, Muhammad Haris Khan:
DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models. 6083-6093 - Zijiao Yang, Xiangxi Shi, Eric Slyman, Stefan Lee:
Hijacking Vision-and-Language Navigation Agents with Adversarial Environmental Attacks. 6094-6103 - Ruoyu Wang, Yangfan He, Tengjiao Sun, Xiang Li, Tianyu Shi:
UniTMGE: Uniform Text-Motion Generation and Editing Model via Diffusion. 6104-6114 - Yehun Song, Sunyoung Cho:
Leveraging CLIP Encoder for Multimodal Emotion Recognition. 6115-6124 - Po-Hsuan Huang, Jeng-Lin Li, Chin-Po Chen, Ming-Ching Chang, Wei-Chao Chen:
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis. 6125-6135 - Daniel Panangian, Ksenia Bittner:
Can Location Embeddings Enhance Super-Resolution of Satellite Imagery? 6136-6145 - Dinghao Jin, Yuan Zeng, Yi Gong:
Bandwidth-Efficient Communication Modelling for Autonomous Vehicle Collaborative Perception. 6146-6155 - Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan:
ConvMixFormer- A Resource-Efficient Convolution Mixer for Transformer-Based Dynamic Hand Gesture Recognition. 6156-6166 - Mathieu Cocheteux, Julien Moreau, Franck Davoine:
Uncertainty-Aware Online Extrinsic Calibration: A Conformal Prediction Approach. 6167-6176 - Floriane Magera, Thomas Hoyoux, Olivier Barnich, Marc Van Droogenbroeck:
BroadTrack: Broadcast Camera Tracking for Soccer. 6177-6187 - Hah Min Lew, Sahng-Min Yoo, Hyunwoo Kang, Gyeong-Moon Park:
Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications. 6188-6196 - Xiang Li, Yangfan He, Shuaishuai Zu, Zhengyang Li, Tianyu Shi, Yiting Xie, Kevin Zhang:
Multi-Modal Large Language Model with RAG Strategies in Soccer Commentary Generation. 6197-6206 - Niloufar Alipour Talemi, Hossein Kashiani, Fatemeh Afghah:
Style-Pro: Style-Guided Prompt Learning for Generalizable Vision-Language Models. 6207-6216 - Hung-Shuo Chang, Chien-Yao Wang, Richard Robert Wang, Gene Chou, Hong-Yuan Mark Liao:
Generalist YOLO: Towards Real-Time End-to-End Multi-Task Visual Language Models. 6217-6227 - Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia:
No Annotations for Object Detection in Art Through Stable Diffusion. 6228-6237 - Manju R. A, Atul Kumar, Akshay Agarwal:
On Which Data Distribution (Synthetic or Real) We Should Rely for Soft Biometric Classification. 6238-6247 - Weixi Weng, Rui Zhang, Xiaojun Meng, Jieming Zhu, Qun Liu, Chun Yuan:
Unsupervised Domain Adaptive Visual Question Answering in the Era of Multi-Modal Large Language Models. 6248-6258 - Cole Hill, Florence Yellin, Krishna Regmi, Dawei Du, Scott McCloskey:
Re-identifying People in Video via Learned Temporal Attention and Multi-modal Foundation Models. 6259-6268 - Yao Zhang, Haokun Chen, Ahmed Frikha, Denis Krompass, Gengyuan Zhang, Jindong Gu, Volker Tresp:
CL-Cross VQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering. 6269-6278 - Abid Hasan Zim, Aquib Iqbal, Zaid Al-Huda, Asad Malik, Minoru Kuribayashi:
EfficientCrackNet: A Lightweight Model for Crack Segmentation. 6279-6289 - Shir Bar, Or Hirschorn, Roi Holzman, Shai Avidan:
Sifting Through the Haystack - Efficiently Finding Rare Animal Behaviors in Large-Scale Datasets. 6290-6299 - Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik:
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance. 6300-6310 - Simone Fobi Nsutezo, Amrita Gupta, Duncan Kebut, Seema Iyer, Luana Marotti, Rahul Dodhia, Juan M. Lavista Ferres, Anthony Ortiz:
PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery. 6311-6319 - Muhammad Arbab Arshad, Talukder Zaki Jubery, Tirtho Roy, Rim Nassiri, Asheesh K. Singh, Arti Singh, Chinmay Hegde, Baskar Ganapathysubramanian, Aditya Balu, Adarsh Krishnamurthy, Soumik Sarkar:
Leveraging Vision Language Models for Specialized Agricultural Tasks. 6320-6329 - Farnoosh Koleini, Muhammad Usama Saleem, Pu Wang, Hongfei Xue, Ahmed Helmy, Abbey Fenwick:
BioPose: Biomechanically-Accurate 3D Pose Estimation from Monocular Videos. 6330-6339 - Aniket Bhattacharyya, Anurag Tripathi:
Information Extraction from Heterogeneous Documents Without Ground Truth Labels Using Synthetic Label Generation and Knowledge Distillation. 6351-6361 - Mingjie Xu, Mengyang Wu, Yuzhi Zhao, Jason Chun Lok Li, Weifeng Ou:
LLaVA-SpaceSGG: Visual Instruct Tuning for Open-Vocabulary Scene Graph Generation with Enhanced Spatial Relations. 6362-6372 - Bishoy Galoaa, Somaieh Amraee, Sarah Ostadabbas:
DragonTrack: Transformer-Enhanced Graphical Multi-Person Tracking in Complex Scenarios. 6373-6382 - Qianying Liu, Paul Henderson, Xiao Gu, Hang Dai, Fani Deligianni:
Learning Semi-Supervised Medical Image Segmentation from Spatial Registration. 6383-6393 - Parinita Nema, Vinod K. Kurmi:
Strategic Base Representation Learning via Feature Augmentations for Few-Shot Class Incremental Learning. 6394-6403 - Bare Luka Zagar, Mingyu Liu, Tim Hertel, Ekim Yurtsever, Alois Knoll:
3D Understanding of Deformable Linear Objects: Datasets and Transferability Benchmark. 6404-6414 - Ryozo Masukawa, Sanggeon Yun, Yoshiki Yamaguchi, Mohsen Imani:
PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation. 6415-6424 - Pascal Schlachter, Simon Wagner, Bin Yang:
Memory-Efficient Pseudo-Labeling for Online Source-Free Universal Domain Adaptation using a Gaussian Mixture Model. 6425-6434 - Alessio Quercia, Erenus Yildiz, Zhuo Cao, Kai Krajsek, Abigail Morrison, Ira Assent, Hanno Scharr:
Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks. 6435-6445 - Shivangi Rai, Rini Smita Thakur, Kunal Jangid, Vinod K. Kurmi:
Label Calibration in Source Free Domain Adaptation. 6446-6455 - Alexey Kravets, Vinay P. Namboodiri:
Zero-Shot Class Unlearning in CLIP with Synthetic Samples. 6456-6464 - Cheng-Yi Lee, Ching-Chia Kao, Cheng-Han Yeh, Chun-Shien Lu, Chia-Mu Yu, Chu-Song Chen:
Defending Against Repetitive Backdoor Attacks on Semi-Supervised Learning Through Lens of Rate-Distortion-Perception Trade-Off. 6465-6474 - Ahmet Serdar Karadeniz, Dimitrios Mallis, Nesryne Mejri, Kseniya Cherenkova, Anis Kacem, Djamila Aouada:
PICASSO: A Feed-Forward Framework for Parametric Inference of CAD Sketches via Rendering Self-Supervision. 6475-6484 - Favour Ekong, Jun Zhou, Kwabena Sarpong, Yongsheng Gao:
Pixel-Wise Shuffling with Collaborative Sparsity for Melanoma Hyperspectral Image Classification. 6485-6494 - Chamuditha Jayanga Galappaththige, Zachary Izzo, Xilin He, Honglu Zhou, Muhammad Haris Khan:
Domain-Guided Weight Modulation for Semi-Supervised Domain Generalization. 6495-6505 - Dario Serez, Marco Cristani, Alessio Del Bue, Vittorio Murino, Pietro Morerio:
Pre-trained Multiple Latent Variable Generative Models are Good Defenders Against Adversarial Attacks. 6506-6516 - Xiaoyu Liu, Beitong Zhou, Zuogong Yue, Cheng Cheng:
PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning. 6517-6527 - Thomas Westfechtel, Dexuan Zhang, Tatsuya Harada:
Combining Inherent Knowledge of Vision-Language Models with Unsupervised Domain Adaptation Through Strong-Weak Guidance. 6528-6537 - Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex M. Bronstein:
Class-Conditioned Transformation for Enhanced Robust Image Classification. 6538-6547 - Sameer Ambekar, Zehao Xiao, Xiantong Zhen, Cees G. M. Snoek:
GeneralizeFormer: Layer-Adaptive Model Generation Across Test-Time Distribution Shifts. 6548-6558 - Shutong Jin, Ruiyu Wang, Kuangyi Chen, Florian T. Pokorny:
PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement. 6559-6569 - Zeyu Shangguan, Daniel Seita, Mohammad Rostami:
Cross-Domain Multi-Modal Few-Shot Object Detection via Rich Text. 6570-6580 - Lei Zhu, Yanyu Xu, Yong Liu, Rick Siow Mong Goh, Xinxing Xu:
Ad2Mix: Adversarial and Adaptive Mixup for Unsupervised Domain Adaptation. 6581-6590 - Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran:
A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval. 6591-6601 - Jia Fu, Xiao Zhang, Sepideh Pashami, Fatemeh Rahimian, Anders Holst:
DiffPAD: Denoising Diffusion-Based Adversarial Patch Decontamination. 6602-6611 - Romain Hermary, Vincent Gaudillière, Abd El Rahman Shabayek, Djamila Aouada:
Removing Geometric Bias in One-Class Anomaly Detection with Adaptive Feature Perturbation. 6612-6622 - Md Farhan Ishmam, Ishmam Tashdeed, Talukder Asir Saadat, Md. Hamjajul Ashmafee, Abu Raihan Mostofa Kamal, Md. Azam Hossain:
Visual Robustness Benchmark for Visual Question Answering (VQA). 6623-6633 - Xiwen Wei, Guihong Li, Radu Marculescu:
Online-LoRA: Task-Free Online Continual Learning via Low Rank Adaptation. 6634-6645 - Benjamin Bauchwitz, Mary L. Cummings:
Task Configuration Impacts Annotation Quality and Model Training Performance in Crowdsourced Image Segmentation. 6646-6656 - Youcef Djenouri, Ahmed Nabil Belbachir, Asma Belhadi, Nassim Belmecheri, Tomasz P. Michalak:
Shapley Consensus Deep Learning for Ensemble Pruning. 6657-6666 - Jiuhong Xiao, Gao Zhu, Giuseppe Loianno:
VG-SSL: Benchmarking Self-Supervised Representation Learning Approaches for Visual Geo-Localization. 6667-6677 - Chenyu Wang, Weixin Luo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, Shenghua Gao:
MLLM-Tool: A Multimodal Large Language Model for Tool Agent Learning. 6678-6687 - Nishq Poorav Desai, Ali Etemad, Michael A. Greenspan:
CycleCrash: A Dataset of Bicycle Collision Videos for Collision Prediction and Analysis. 6688-6698 - Marco Colussi, Sergio Mascetti, Jose Dolz, Christian Desrosiers:
ReC-TTT: Contrastive Feature Reconstruction for Test-Time Training. 6699-6708 - Manojna Sistla, Yu Wen, Aamir Bader Shah, Chenpei Huang, Lening Wang, Xuqing Wu, Jiefu Chen, Miao Pan, Xin Fu:
Bit-Flip Induced Latency Attacks in Object Detection. 6709-6718 - Jiarui Sun, M. Ugur Akcal, Girish Chowdhary, Wei Zhang:
MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning. 6719-6729 - Amit Baras, Alon Zolfi, Yuval Elovici, Asaf Shabtai:
QuantAttack: Exploiting Quantization Techniques to Attack Vision Transformers. 6730-6740 - Fumioki Sato, Hideaki Hayashi, Hajime Nagahara:
Multi-task Learning of Classification and Generation for Set-structured Data. 6741-6751 - Xiaowei Yu, Zhe Huang, Zao Zhang:
Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation. 6752-6761 - Frank Fundel, Johannes Schusterbauer, Vincent Tao Hu, Björn Ommer:
Distillation of Diffusion Features for Semantic Correspondence. 6762-6774 - Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez-Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman:
SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation. 6775-6784 - Neeresh Kumar Perla, Md. Iqbal Hossain, Afia Sajeeda, Ming Shao:
Are Exemplar-Based Class Incremental Learning Models Victim of Black-Box Poison Attacks? 6785-6794 - Hoin Jung, Xiaoqian Wang:
Towards On-the-Fly Novel Category Discovery in Dynamic Long-Tailed Distributions. 6795-6804 - Feng Cheng, Ziyang Wang, Yi-Lin Sung, Yan-Bo Lin, Mohit Bansal, Gedas Bertasius:
DAM: Dynamic Adapter Merging for Continual Video QA Learning. 6805-6817 - Son Minh Nguyen, Tran Duy Linh, Duc Viet Le, Paul J. M. Havinga:
Multi-Surrogate-Teacher Assistance for Representation Alignment in Fingerprint-Based Indoor Localization. 6818-6827 - Yu-Shan Tai, An-Yeu Andy Wu:
AMP-ViT: Optimizing Vision Transformer Efficiency with Adaptive Mixed-Precision Post-Training Quantization. 6828-6837 - Younggeol Cho, Youngrae Kim, Junho Yoon, Seunghoon Hong, Dongman Lee:
Feature Augmentation Based Test-Time Adaptation. 6838-6847 - David Tschirschwitz, Volker Rodehorst:
Label Convergence: Defining an Upper Performance Bound in Object Recognition Through Contradictory Annotations. 6848-6857 - Tavis Shore, Oscar Mendez, Simon Hadfield:
SpaGBOL: Spatial-Graph-Based Orientated Localisation. 6858-6867 - Sethupathy Parameswaran, Yuan Fang, Chandan Gautam, Savitha Ramasamy, Xiaoli Li:
Learning to Identify Seen, Unseen and Unknown in the Open World: A Practical Setting for Zero-Shot Learning. 6868-6878 - Junki Mori, Kosuke Kihara, Taiki Miyagawa, Akinori F. Ebihara, Isamu Teranishi, Hisashi Kashima:
Federated Source-Free Domain Adaptation for Classification: Weighted Cluster Aggregation for Unlabeled Data. 6879-6889 - Alin Dondera, Anuj Singh, Hadi Jamali Rad:
MAGMA: Manifold Regularization for MAEs. 6890-6899 - Fardad Dadboud, Hamid Azad, Varun Mehta, Miodrag Bolic, Iraj Mantegh:
DrIFT: Autonomous Drone Dataset with Integrated Real and Synthetic Data, Flexible Views, and Transformed Domains. 6900-6910 - Xiwen Chen, Huayu Li, Peijie Qiu, Wenhui Zhu, Rahul Amin, Abolfazl Razi:
RD-DPP: Rate-Distortion Theory Meets Determinantal Point Process to Diversify Learning Data Samples. 6911-6920 - Nicolas Harvey Chapman, Christopher F. Lehnert, Will N. Browne, Feras Dayoub:
Enhancing Embodied Object Detection with Spatial Feature Memory. 6921-6931 - Tom Pégeot, Eva Feillet, Adrian Popescu, Inna Kucher, Bertrand Delezoide:
Temporal Dynamics in Visual Data: Analyzing the Impact of Time on Classification Accuracy. 6932-6943 - Jayateja Kalla, Rohit Kumar, Soma Biswas:
TACLE: Task and Class-Aware Exemplar-Free Semi-Supervised Class Incremental Learning. 6944-6954 - Tobias Christian Nauen, Sebastian Palacio, Federico Raue, Andreas Dengel:
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers. 6955-6966 - Zorana Dozdor, Tomislav Hrkac, Zoran Kalafatic:
SV-data2vec: Guiding Video Representation Learning with Latent Skeleton Targets. 6967-6976 - Eric Yang Yu, Christopher Liao, Sathvik Ravi, Theodoros Tsiligkaridis, Brian Kulis:
Image-Caption Encoding for Improving Zero-Shot Generalization. 6977-6986 - Jisu Han, Jaemin Na, Wonjun Hwang:
Semantic Prompting with Image Token for Continual Learning. 6987-6997 - Henry Hölzemann, Torsten Fiolka:
Semantic Clustering of Image Retrieval Databases used for Visual Localization. 6998-7007 - Moritz Nottebaum, Matteo Dunnhofer, Christian Micheloni:
LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones. 7008-7018 - Maksim Zhdanov, Stanislav Dereka, Sergey Kolesnikov:
Identity Curvature Laplace Approximation for Improved Out-of-Distribution Detection. 7019-7028 - Dhanunjaya Varma Devalraju, C. Chandra Sekhar:
Uncertainty-Guided Metric Learning Without Labels. 7029-7038 - Meghana Karri, Amit Soni Arya, Koushik Biswas, Nicolo Gennaro, Vedat Cicek, Gorkem Durak, Yuri S. Velichko, Ulas Bagci:
Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-Supervised Medical Image Segmentation. 7039-7048 - Yuguang Yao, Jiancheng Liu, Yifan Gong, Xiaoming Liu, Yanzhi Wang, Xue Lin, Sijia Liu:
Can Adversarial Examples be Parsed to Reveal Victim Model Information? 7049-7061 - Li-Ying Hung, Cooper Cheng-Yuan Ku:
Knockoff Branch: Model Stealing Attack via Adding Neurons in the Pre-Trained Model. 7062-7070 - Mikhail Papkov, Pavel Chizhov, Leopold Parts:
SwinIA: Self-Supervised Blind-Spot Image Denoising Without Convolutions. 7071-7080 - Anton Frolov, Florian Kleiner, Christiane Rößler, Volker Rodehorst:
Needles & Haystacks: Dataset and Benchmark for Domain-Agnostic Image-Based Rigid Slice-to-Volume Registration. 7081-7091 - Gustavo Adolfo Vargas Hakim, David Osowiechi, Mehrdad Noori, Milad Cheraghalikhani, Ali Bahri, Moslem Yazdanpanah, Ismail Ben Ayed, Christian Desrosiers:
CLIPArTT: Adaptation of CLIP to New Domains at Test Time. 7092-7101 - Kamalakar Vijay Thakare, Lalit Lohani, Kamakshya Prasad Nayak, Debi Prosad Dogra, Heeseung Choi, Hyungjoo Jung, Ig-Jae Kim:
CLIPping Imbalances: A Novel Evaluation Baseline and PEARL Dataset for Pedestrian Attribute Recognition. 7102-7111 - Sahar Rahimi Malakshan, Mohammad Saeed Ebrahimi Saadabadi, Ali Dabouei, Nasser M. Nasrabadi:
Decomposed Distribution Matching in Dataset Condensation. 7112-7122 - Quazi Mishkatul Alam, Bilel Tarchoun, Ihsen Alouani, Nael B. Abu-Ghazaleh:
Adversarial Attention Deficit: Fooling Deformable Vision Transformers with Collaborative Adversarial Patches. 7123-7132 - Adrian Iordache, Bogdan Alexe, Radu Tudor Ionescu:
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets. 7133-7142 - Vandan Gorade, Azad Singh, Deepak Mishra:
OTCXR: Rethinking Self-supervised Alignment using Optimal Transport for Chest X-ray Analysis. 7143-7152 - Nicholas John Eliopoulos, Purvish Jajal, James C. Davis, Gaowen Liu, George K. Thiruvathukal, Yung-Hsiang Lu:
Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge. 7153-7162 - Marina Ceccon, Davide Dalle Pezze, Alessandro Fabris, Gian Antonio Susto:
Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark. 7163-7172 - S. Divakar Bhat, Amit More, Mudit Soni, Bhuvan Aggarwal:
PC-GZSL: Prior Correction for Generalized Zero Shot Learning. 7173-7183 - Daehwan Kim, Hyungmin Kim, Daun Jeong, Sungho Suh, Hansang Cho:
SPACE: SPAtial-Aware Consistency rEgularization for Anomaly Detection in Industrial Applications. 7184-7194 - Sarmistha Das, Basha Mujavarsheik, R. E. Zera Lyngkhoi, Sriparna Saha, Alka Maurya:
Deciphering the Complaint Aspects: Towards an Aspect-Based Complaint Identification Model with Video Complaint Dataset in Finance. 7195-7204 - Shubhi Shukla, Subhadeep Dalui, Manaar Alam, Shubhajit Datta, Arijit Mondal, Debdeep Mukhopadhyay, Partha Pratim Chakrabarti:
Guardian of the Ensembles: Introducing Pairwise Adversarially Robust Loss for Resisting Adversarial Attacks in DNN Ensembles. 7205-7214 - Michelle Guo, Mia Tang, Hannah Cha, Ruohan Zhang, C. Karen Liu, Jiajun Wu:
CRAFT: Designing Creative and Functional 3D Objects. 7215-7224 - Chaitanya Animesh, Manmohan Chandraker:
Tuned Contrastive Learning. 7225-7234 - Flavien Armangeon, Thibaud Ehret, Enric Meinhardt-Llopis, Rafael Grompone von Gioi, Guillaume Thibault, Marc Petit, Gabriele Facciolo:
IRIS-VIS: A New Dataset for Visibility Estimation in an Industrial Environment. 7235-7243 - Xinglong Sun, Maying Shen, Hongxu Yin, Lei Mao, Pavlo Molchanov, José M. Álvarez:
Advancing Weight and Channel Sparsification with Enhanced Saliency. 7244-7255 - Tushar Kadam, Utkarsh Mishra, Aakarsh Malhotra:
SHIP: Structural Hierarchies for Instance-Dependent Partial Labels. 7256-7265 - Juan Pablo Lagos, Haider Ali, Adnan Faroque, Esa Rahtu:
Heterogeneous Datasets for Unsupervised Image Anomaly Detection. 7266-7276 - Chandan Kumar Singh, Devesh Kumar, Vipul Sanap, Rajesh Sinha:
LLM-RSPF: Large Language Model-Based Robotic System Planning Framework for Domain Specific Use-cases. 7277-7286 - Mehran Hosseini, Peyman Hosseini:
GeoPos: A Minimal Positional Encoding for Enhanced Fine-Grained Details in Image Synthesis Using Convolutional Neural Networks. 7287-7297 - Sayanta Adhikari, Dupati Srikar Chandra, P. K. Srijith, Pankaj Wasnik, Naoyuki Onoe:
AdaPrefix++: Integrating Adapters, Prefixes and Hypernetwork for Continual Learning. 7298-7307 - Viti Mario, Nadiya Shvai, Arcadi Llanza, Amir Nakib:
A 0-Shot Self-Attention Mechanism for Accelerated Diagonal Attention. 7308-7315 - Juntae Kim, Sungwon Woo, Jongho Nang:
Relational Self-Supervised Distillation with Compact Descriptors for Image Copy Detection. 7316-7325 - Diogo Lavado, Ricardo Santos, André Coelho, João Santos, Alessandra Micheletti, Cláudia Soares:
Learning Under Noisy Labels, Spurious Points, and Diverse Structures: TS40K, a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission Systems. 7326-7336 - Minxia Xu, Han Yang, Bo Song, Weida Hu, Jinshui Miao, Erkang Cheng:
Cross Image Feature Perturbation with Pseudo Label Fusion for Semi-Supervised Medical Image Segmentation. 7337-7347 - Ayumu Saito, Prachi Kudeshia, Jiju Poovvancheri:
Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud. 7348-7357 - Camille Garcin, Maximilien Servajean, Alexis Joly, Joseph Salmon:
A Two-Head Loss Function for Deep Average-K Classification. 7358-7367 - Diana-Nicoleta Grigore, Mariana-Iuliana Georgescu, Jon Álvarez Justo, Tor Arne Johansen, Andreea Iuliana Ionescu, Radu Tudor Ionescu:
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers. 7368-7378 - Tanay Agrawal, Mohammed Guermal, Michal Balazia, François Brémond:
CM3T: Framework for Efficient Multimodal Learning for Inhomogeneous Interaction Datasets. 7379-7388 - Ragja Palakkadavath, Hung Le, Thanh Nguyen-Tang, Sunil Gupta, Svetha Venkatesh:
Fair Domain Generalization with Heterogeneous Sensitive Attributes Across Domains. 7389-7398 - Divya Saxena, Jiannong Cao, Jiahao Xu, Tarun Kulshrestha:
Data-Efficient Alignment in Medical Imaging via Reconfigurable Generative Networks. 7399-7408 - Stefan Smeu, Elena Burceanu, Emanuela Haller, Andrei Liviu Nicolicioiu:
Robust Novelty Detection Through Style-Conscious Feature Ranking. 7409-7418 - Sankalp Nagaonkar, Achyut Mani Tripathi, Ashish Mishra:
When Visual State Space Model Meets Backdoor Attacks. 7419-7428 - Arnisha Khondaker, Nilanjan Ray:
Learning Instance-Specific Parameters of Black-Box Models Using Differentiable Surrogates. 7429-7438 - Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson:
ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage. 7439-7449 - Göksel Mert Çökmez, Yang Zhang, Christopher Schroers, Tunç Ozan Aydin:
CLIP-Fusion: A Spatio-Temporal Quality Metric for Frame Interpolation. 7450-7459 - Haleh Damirchi, Ali Etemad, Michael A. Greenspan:
Socially-Informed Reconstruction for Pedestrian Trajectory Forecasting. 7460-7469 - Colton R. Crum, Adam Czajka:
MENTOR: Human Perception-Guided Pretraining for Increased Generalization. 7470-7479 - Savitha Sam Abraham, Sourav Garg, Feras Dayoub:
To Ask or Not to Ask? Detecting Absence of Information in Vision and Language Navigation. 7480-7489 - Xuhui Kang, Yen-Ling Kuo:
Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits. 7490-7499 - Reza Akbarian Bafghi, Nidhin Harilal, Claire Monteleoni, Maziar Raissi:
MixDiff: Mixing Natural and Synthetic Images for Robust Self-Supervised Representations. 7500-7511 - Zhongyao Cheng, Fang Wu, Peisheng Qian, Ziyuan Zhao, Xulei Yang:
AIC3DOD: Advancing Indoor Class-Incremental 3D Object Detection with Point Transformer Architecture and Room Layout Constraints. 7512-7521 - Maheswar Bora, Saurabh Atreya, Aritra Mukherjee, Abhijit Das:
KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder. 7522-7532 - Vishnuprasadh Kumaravelu, P. K. Srijith, Sunil Gupta:
EvoCL: Continual Learning over Evolving Domains. 7533-7541 - Cheeun Hong, Sungyong Baik, Junghun Oh, Kyoung Mu Lee:
Difficulty, Diversity, and Plausibility: Dynamic Data-Free Quantization. 7542-7551 - Marco Blanchini, Giovanna Maria Dimitri, Lydia Abady, Benedetta Tondi, Tarcisio Lancioni, Mauro Barni:
Semiotic-Based Construction of a Large Emotional Image Dataset with Neutral Samples. 7552-7561 - Wojciech Lapacz, Daniel Marczak, Filip Szatkowski, Tomasz Trzcinski:
Exploring the Stability Gap in Continual Learning: The Role of the Classification Head. 7562-7571 - Aniana Cruz, Guilherme G. Schardong, Luiz Schirmer, João Marcos, Farhad Shadmand, Nuno Gonçalves:
RiemStega: Covariance-Based Loss for Print-Proof Transmission of Data in Images. 7572-7581 - Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang:
Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm. 7582-7592 - Krishna Kanth Nakka, Alexandre Alahi:
NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability. 7593-7604 - David Tschirschwitz, Volker Rodehorst:
CISOL: An Open and Extensible Dataset for Table Structure Recognition in the Construction Industry. 7605-7613 - Mingxian Li, Hao Sun, Yingtie Lei, Xiaofeng Zhang, Yihang Dong, Yilin Zhou, Zimeng Li, Xuhang Chen:
High-Fidelity Document Stain Removal via A Large-Scale Real-World Dataset and A Memory-Augmented Transformer. 7614-7624 - Eva Feillet, Adrian Popescu, Céline Hudelot:
A Reality Check on Pre-training for Exemplar-free Class-Incremental Learning. 7625-7636 - Tamara R. Lenhard, Andreas Weinmann, Kai Franke, Tobias Koch:
SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection. 7637-7647 - Olaf Wysocki, Yue Tan, Thomas Froech, Yan Xia, Magdalena Wysocki, Ludwig Hoegner, Daniel Cremers, Christoph Holst:
ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset. 7648-7658 - Yuwei Chen, Ming-Ching Chang, Matthias Kirchner, Zhenfei Zhang, Xin Li, Arslan Basharat, Anthony Hoogs:
A Semantically Impactful Image Manipulation Dataset: Characterizing Image Manipulations Using Semantic Significance. 7659-7668 - Varun Burde, Assia Benbihi, Pavel Burget, Torsten Sattler:
Comparative Evaluation of 3D Reconstruction Methods for Object Pose Estimation. 7669-7681 - Yachuan Li, Xavier Soria Poma, Yun Bai, Qian Xiao, Chaozhi Yang, Guanlin Li, Zongmin Li:
EDMB: Edge Detector with Mamba. 7682-7691 - Alberto Presta, Enzo Tartaglione, Attilio Fiandrotti, Marco Grangetto, Pamela C. Cosman:
Efficient Progressive Image Compression with Variance-Aware Masking. 7692-7700 - Alex Tianyi Xu, Alex Wilf, Paul Pu Liang, Alexander Obolenskiy, Daniel Fried, Louis-Philippe Morency:
Comparative Knowledge Distillation. 7701-7710 - Evgenii Kruzhkov, Sven Behnke:
LiLMaps: Learnable Implicit Language Maps. 7711-7720 - Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy:
SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data. 7731-7741 - Manuel Knott, Ignacio Serna, Ethan Mann, Pietro Perona:
A Rapid Test for Accuracy and Bias of Face Recognition Technology. 7742-7751 - Marius Kästingschäfer, Théo Gieruc, Sebastian Bernhard, Dylan Campbell, Eldar Insafutdinov, Eyvaz Najafli, Thomas Brox:
SEED4D: A Synthetic Ego-Exo Dynamic 4D Data Generator, Driving Dataset and Benchmark. 7752-7764 - Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric A. Stone, Warren Conaty, Lars Petersson, Vivien Rolland:
BioNet and NeFF: Crop Biomass Prediction from Point Clouds to Drone Imagery. 7765-7775 - Peyman Rostami, Nilotpal Sinha, Nidhal Eddine Chenni, Anis Kacem, Abd El Rahman Shabayek, Carl Shneider, Djamila Aouada:
Information Theoretic Pruning of Coupled Channels in Deep Neural Networks. 7776-7786 - Yibo Zhong, Yao Zhou:
Rethinking Low-Rank Adaptation in Vision: Exploring Head-Level Responsiveness across Diverse Tasks. 7787-7796 - Filippo Botti, Alex Ergasti, Leonardo Rossi, Tomaso Fontanini, Claudio Ferrari, Massimo Bertozzi, Andrea Prati:
Mamba-ST: State Space Model for Efficient Style Transfer. 7797-7806 - Marco Paul E. Apolinario, Arani Roy, Kaushik Roy:
LLS: Local Learning Rule for Deep Neural Networks Inspired by Neural Activity Synchronization. 7807-7816 - Kai Chen, Yanze Li, Wenhua Zhang, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia:
Automated Evaluation of Large Vision-Language Models on Self-Driving Corner Cases. 7817-7826 - Tejaswini Medi, Steffen Jung, Margret Keuper:
FAIR-TAT: Improving Model Fairness Using Targeted Adversarial Training. 7827-7836 - Xilin He, Cheng Luo, Qinliang Lin, Weicheng Xie, Muhammad Haris Khan, Siyang Song, Linlin Shen:
Towards Robust Training via Gradient-Diversified Backpropagation. 7847-7856 - Arjun Sridhar, Yiran Chen:
Delta-NAS: Difference of Architecture Encoding for Predictor-Based Evolutionary Neural Architecture Search. 7857-7865 - Sagar M. Waghmare, Kimberly Wilber, Dave Hawkey, Xuan Yang, Matthew Wilson, Stephanie Debats, Cattalyya Nuengsigkapian, Astuti Sharma, Lars Pandikow, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko:
SANPO: A Scene Understanding, Accessibility and Human Navigation Dataset. 7866-7875 - Hrishav Bakul Barua, Kalin Stefanov, KokSheik Wong, Abhinav Dhall, Ganesh Krishnasamy:
GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction. 7876-7886 - Nguyen Son Dinh, Tuan Dung Nguyen, Duc Tri Tran, Nguyen Dang Huy Pham, Thuan Hieu Tran, Ngoc Anh Tong, Quang Huy Hoang, Phi Le Nguyen:
Sign Language Recognition: A Large-scale Multi-view Dataset and Comprehensive Evaluation. 7887-7897 - Purvish Jajal, Nick John Eliopoulos, Benjamin Shiue-Hal Chou, George K. Thiruvathukal, James C. Davis, Yung-Hsiang Lu:
Token Turing Machines are Efficient Vision Models. 7898-7907 - Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah:
ROADS: Robust Prompt-Driven Multi-Class Anomaly Detection Under Domain Shift. 7908-7917 - Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen:
PivotAlign: Improve Semi-Supervised Learning by Learning Intra-Class Heterogeneity and Aligning with Pivots. 7918-7927 - Komal Kumar, Snehashis Chakraborty, Dwarikanath Mahapatra, Behzad Bozorgtabar, Sudipta Roy:
Self-Supervised Anomaly Segmentation via Diffusion Models with Dynamic Transformer UNet. 7928-7938 - Moshe Kimhi, David Vainshtein, Chaim Baskin, Dotan Di Castro:
Robot Instance Segmentation with Few Annotations for Grasping. 7939-7949 - Trung-Anh Dang, Vincent Nguyen, Ngoc-Son Vu, Christel Vrain:
Memory-efficient Continual Learning with Neural Collapse Contrastive. 7950-7959 - Jiahao Xu, Zikai Zhang, Rui Hu:
Identify Backdoored Model in Federated Learning via Individual Unlearning. 7960-7969 - Luca Ciampi, Nicola Messina, Matteo Pierucci, Giuseppe Amato, Marco Avvenuti, Fabrizio Falchi:
Mind the Prompt: A Novel Benchmark for Prompt-Based Class-Agnostic Counting. 7970-7979 - Spencer Carmichael, Manohar Bhat, Mani Ramanagopal, Austin Buchan, Ram Vasudevan, Katherine A. Skinner:
TRNeRF: Restoring Blurry, Rolling Shutter, and Noisy Thermal Images with Neural Radiance Fields. 7980-7990 - Sergey Korchagin, Ekaterina Zaychenkova, Aleksei Khalin, Aleksandr Yugay, Alexey Zaytsev, Egor I. Ershov:
Improving Uncertainty Estimation with Confidence-Aware Training Data. 7991-8001 - Jaisidh Singh, Ishaan Shrivastava, Mayank Vatsa, Richa Singh, Aparna Bharati:
Learning the Power of "No": Foundation Models with Negations. 8002-8012 - Guiqiu Liao, Matjaz Jogan, Sai Koushik, Eric Eaton, Daniel A. Hashimoto:
Disentangling Spatio-Temporal Knowledge for Weakly Supervised Object Detection and Segmentation in Surgical Video. 8013-8023 - Tianyi Ma, Maoying Qiao:
Disentangle Source and Target Knowledge for Continual Test-Time Adaptation. 8024-8034 - Rini Smita Thakur, Vinod K. Kurmi:
Uncertainty and Energy based Loss Guided Semi-Supervised Semantic Segmentation. 8035-8045 - Jun Chen, Faizan Farooq Khan, Ming Hu, Ammar Sherif, Zongyuan Ge, Boyang Li, Mohamed Elhoseiny:
Local Masked Reconstruction for Efficient Self-Supervised Learning on High-Resolution Images. 8046-8056 - Shijie Wang, Dahun Kim, Ali Taalimi, Chen Sun, Weicheng Kuo:
Learning Visual Grounding from Generative Vision and Language Model. 8057-8067 - Payal Mohadikar, Ye Duan:
OmniDiffusion: Reformulating 360 Monocular Depth Estimation Using Semantic and Surface Normal Conditioned Diffusion. 8068-8078 - Jonás Serých, Michal Neoral, Jiri Matas:
MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation. 8079-8089 - Yuwen Heng, Yihong Wu, Srinandan Dasmahapatra, Hansung Kim:
MatSpectNet: Material Segmentation Network with Domain-Aware and Physically-Constrained Hyperspectral Reconstruction. 8090-8100 - Jiahao Zhang, Frederic Z. Zhang, Cristian Rodriguez, Yizhak Ben-Shabat, Anoop Cherian, Stephen Gould:
Temporally Grounding Instructional Diagrams in Unconstrained Videos. 8101-8111 - Shangbo Mao, Deepu Rajan:
An Encoder-Agnostic Weakly Supervised Method For Describing Textures. 8112-8121 - Khurram Azeem Hashmi, Talha Uddin Sheikh, Didier Stricker, Muhammad Zeshan Afzal:
Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection. 8122-8133 - Nan Peng, Xun Zhou, Mingming Wang, Xiaojun Yang, Songming Chen, Guisong Chen:
PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction. 8134-8143 - George Leotescu, Alin-Ionut Popa, Diana Grigore, Daniel Voinea, Pietro Perona:
Self-Supervised Incremental Learning of Object Representations from Arbitrary Image Sets. 8144-8154 - Yuhao Lin, Haiming Xu, Lingqiao Liu, Javen Qinfeng Shi:
A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting. 8155-8164 - Xinpeng Liu, Hiroaki Santo, Yosuke Toda, Fumio Okura:
TreeFormer: Single-View Plant Skeleton Estimation via Tree-Constrained Graph Generation. 8165-8175 - Yanan Gu, Muli Yang, Xu Yang, Kun Wei, Hongyuan Zhu, Gabriel James Goenawan, Cheng Deng:
Dynamic Adapter Tuning for Long-Tailed Class-Incremental Learning. 8176-8185 - Timur Z. Mamedov, Anton Konushin, Vadim Konushin:
ReMix: Training Generalized Person Re-Identification on a Mixture of Data. 8186-8196 - Shen Zheng, Anurag Ghosh, Srinivasa G. Narasimhan:
Instance-Warp: Saliency Guided Image Warping for Unsupervised Domain Adaptation. 8197-8206 - Riku Inoue, Masamitsu Tsuchiya, Yuji Yasui:
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection. 8207-8216 - Philipp Allgeuer, Kyra Ahrens, Stefan Wermter:
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion. 8217-8228 - Ryo Fujii, Ryo Hachiuma, Hideo Saito:
CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting. 8229-8238 - Hong Liu, Yuta Nakashima, Noboru Babaguchi:
Paladin: Understanding Video Intentions in Political Advertisement Videos. 8239-8248 - Clement Tan, Chai Kiat Yeo, Cheston Tan, Basura Fernando:
Inferring Past Human Actions in Homes with Abductive Reasoning. 8249-8258 - Gyuseong Lee, Wooseok Jang, Jinhyeon Kim, Jaewoo Jung, Seungryong Kim:
Domain Generalization using Large Pretrained Models with Mixture-of-Adapters. 8259-8269 - Hankyul Kang, Jongbin Ryu:
Enriching Local Patterns with Multi-Token Attention for Broad-Sight Neural Networks. 8270-8279 - Jaehyun Choi, Junwon Ko, Dong-Jae Lee, Junmo Kim:
AH-OCDA: Amplitude-Based Curriculum Learning and Hopfield Segmentation Model for Open Compound Domain Adaptation. 8280-8290 - Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong:
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection. 8291-8301 - Mustafa Munir, Md Mostafijur Rahman, Radu Marculescu:
RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone. 8302-8312 - Avi Gupta, Koteswar Rao Jerripothula, Tammam Tillo:
CIRCOD: Co-Saliency Inspired Referring Camouflaged Object Discovery. 8313-8323 - Takehiko Ohkawa, Takuma Yagi, Taichi Nishimura, Ryosuke Furuta, Atsushi Hashimoto, Yoshitaka Ushiku, Yoichi Sato:
Exo2EgoDVC: Dense Video Captioning of Egocentric Procedural Activities Using Web Instructional Videos. 8324-8335 - Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal:
TAM-VT: Transformation-Aware Multi-Scale Video Transformer for Segmentation and Tracking. 8336-8345 - Bokyeung Lee, Jonghwan Hong, Hyunuk Shin, Bonhwa Ku, Hanseok Ko:
Dropout Connects Transformers and CNNs: Transfer General Knowledge for Knowledge Distillation. 8346-8355 - Tsung-Yu Chen, Luyu Yang, Tzu-Yu Chuang, Shang-Hong Lai:
CACE: Sim-to-Real Indoor 3D Semantic Segmentation via Context-Aware Augmentation and Consistency Enforcement. 8356-8367 - Koichiro Ito:
Feature Design for Bridging SAM and CLIP Toward Referring Image Segmentation. 8368-8378 - Seonguk Seo, Bohyung Han:
Re-Evaluating Group Robustness via Adaptive Class-Specific Scaling. 8379-8388 - Teppei Kurita, Yuhi Kondo, Legong Sun, Takayuki Sasaki, Sho Nitta, Yasuhiro Hashimoto, Yoshinori Muramatsu, Yusuke Moriuchi:
Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation. 8389-8399 - Jiacheng Li, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan, Zhiwei Xiong:
Multi-Spectral Image Color Reproduction. 8400-8409 - Jonathan Lee, Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang, Fu-En Wang, Yi-Hsuan Tsai, Min Sun:
uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images. 8410-8419 - Thanh-Son Nguyen, Hong Yang, Basura Fernando:
Effective Scene Graph Generation by Statistical Relation Distillation. 8420-8430 - Christoph Reinders, Radu Berdan, Beril Besbinar, Junji Otsuka, Daisuke Iso:
RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation. 8431-8443 - Aleyna Kütük, Tevfik Metin Sezgin:
Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation. 8444-8453 - Jiyang Yu, Tianhao Zhang, Fuhao Shi, Lei He, Chia-Kai Liang:
SensorFlow: Sensor and Image Fused Video Stabilization. 8454-8463 - Jaehyun Park, Nam Ik Cho:
Explicit Guidance for Robust Video Frame Interpolation Against Discontinuous Motions. 8464-8473 - Poulami Sinhamahapatra, Franziska Schwaiger, Shirsha Bose, Huiyu Wang, Karsten Roscher, Stephan Günnemann:
Finding Dino: A Plug-and-Play Framework for Zero-Shot Detection of Out-of-Distribution Objects Using Prototypes. 8474-8483 - Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou:
Pix2Poly: A Sequence Prediction Method for End-to-End Polygonal Building Footprint Extraction from Remote Sensing Imagery. 8484-8493 - Md. Alimoor Reza, Eric Manley, Sean Chen, Sameer Chaudhary, Jacob Elafros:
SegBuilder: A Semi-Automatic Annotation Tool for Segmentation. 8494-8503 - Ali Bahri, Mehrdad Noori, Gustavo Adolfo Vargas Hakim, Ismail Ben Ayed, Milad Cheraghalikhani, David Osowiechi, Christian Desrosiers, Moslem Yazdanpanah:
FDS: Feedback-Guided Domain Synthesis with Multi-Source Conditional Diffusion Models for Domain Generalization. 8504-8514 - Grégoire Petit, Nathan Palluau, Axel Bauer, Clemens Dlaska:
EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation Using Synthetic Data. 8515-8524 - Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis, Kate Saenko, Bryan A. Plummer:
ERM++: An Improved Baseline for Domain Generalization. 8525-8535 - Adam Pardyl, Grzegorz Kurzejamski, Jan Olszewski, Tomasz Trzcinski, Bartosz Zielinski:
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers. 8536-8545 - Nimeshika Udayangani, Hadi M. Dolatabadi, Sarah M. Erfani, Christopher Leckie:
Exploiting Inter-Sample Information for Long-Tailed Out-of-Distribution Detection. 8546-8555 - Ahmad Darkhalil, Rhodri Guerrier, Adam W. Harley, Dima Damen:
EgoPoints: Advancing Point Tracking for Egocentric Videos. 8556-8565 - Atif Belal, Akhil Meethal, Francisco Perdigon Romero, Marco Pedersoli, Eric Granger:
Attention-Based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors. 8566-8575 - Ripon Kumar Saha, Scott McCloskey, Suren Jayasuriya:
MetaVIn: Meteorological and Visual Integration for Atmospheric Turbulence Strength Estimation. 8576-8585 - Minjoon Jung, Youwon Jang, Seongho Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang:
Background-Aware Moment Detection for Video Moment Retrieval. 8586-8596 - Yuka Ogino, Yuho Shoji, Takahiro Toizumi, Atsushi Ito:
ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing. 8597-8605 - Jan Olszewski, Dawid Rymarczyk, Piotr Wójcik, Mateusz Pach, Bartosz Zielinski:
TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration. 8606-8616 - Li Sun, Chaitanya Ahuja, Peng Chen, Matt D'Zmura, Kayhan Batmanghelich, Philip Bontrager:
Multi-Modal Large Language Models are Effective Vision Learners. 8617-8626 - Haojie Mu, Burhan Ul Tayyab, Nicholas Chua:
SpiralMLP: A Lightweight Vision MLP Architecture. 8627-8637 - Jinpeng He, Biyuan Liu, Huaixin Chen:
HDPNet: Hourglass Vision Transformer with Dual-Path Feature Pyramid for Camouflaged Object Detection. 8638-8647 - Filippos Gouidis, Konstantinos E. Papoutsakis, Theodore Patkos, Antonis A. Argyros, Dimitris Plexousakis:
Recognizing Unseen States of Unknown Objects by Leveraging Knowledge Graphs. 8648-8659 - Christian Witte, Jens Behley, Cyrill Stachniss, Marvin Raaijmakers:
Epipolar Attention Field Transformers for Bird's Eye View Semantic Segmentation. 8660-8669 - Akshaya Athwale, Ichrak Shili, Émile Bergeron, Ola Ahmad, Jean-François Lalonde:
DarSwin-Unet: Distortion Aware Architecture. 8670-8680 - Abdelrahman M. Shaker, Syed Talal Wasim, Martin Danelljan, Salman H. Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan:
Efficient Video Object Segmentation via Modulated Cross-Attention Memory. 8681-8690 - André Sacilotti, Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida:
Transferable-Guided Attention Is All You Need for Video Domain Adaptation. 8691-8701 - Zahidul Islam, Sujoy Paul, Mrigank Rochan:
Unsupervised Video Highlight Detection by Learning from Audio and Visual Recurrence. 8702-8711 - Junsu Choi, Jin-Seop Lee, Noo-Ri Kim, SuHyun Yoon, Jee-Hyong Lee:
Feature-Level and Spatial-Level Activation Expansion for Weakly-Supervised Semantic Segmentation. 8712-8722 - Weihan Luo, Anagh Malik, David B. Lindell:
Transientangelo: Few-Viewpoint Surface Reconstruction Using Single-Photon Lidar. 8723-8733 - Narongthat Thanyawet, Photchara Ratsamee, Yuki Uranishi, Haruo Takemura:
Detective Networks: Enhancing Disaster Recognition in Images Through Attention Shifting Using Optimal Masking. 8734-8743 - Yaxin Feng, Yuan Lan, Luchan Zhang, Yang Xiang:
ElasticLaneNet: An Efficient Geometry-Flexible Lane Detection Framework. 8744-8753 - Jingyi Xu, Hieu Le, Dimitris Samaras:
Learning to Count from Pseudo-Labeled Segmentation. 8754-8763 - Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen:
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation. 8764-8774 - Katherine Xu, Lingzhi Zhang, Jianbo Shi:
Detecting Origin Attribution for Text-to-Image Diffusion Models. 8775-8785 - Alessandro D'Amelio, Giuseppe Cartella, Vittorio Cuculo, Manuele Lucchi, Marcella Cornia, Rita Cucchiara, Giuseppe Boccignone:
TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes. 8786-8795 - Tianlong Tan, Bin Chen, Hongliang Cao, Chenggang Yan, Yike Ma, Feng Dai:
DASC-SPT: Towards Self-Supervised Panoramic Semantic Segmentation. 8796-8805 - Peter Hönig, Stefan Thalhammer, Jean-Baptiste Weibel, Matthias Hirschmanner, Markus Vincze:
Shape-Biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation. 8806-8815 - Lei Shi, Paul C. Bürkner, Andreas Bulling:
ActionDiffusion: An Action-Aware Diffusion Model for Procedure Planning in Instructional Videos. 8816-8825 - Hung Huy Nguyen, Pooyan Rahmanzadehgervi, Long Mai, Anh Totti Nguyen:
Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence. 8826-8833 - Calvin Glisson, Qiuxiao Chen:
HSDA: High-Frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation. 8834-8843 - Pongsakorn Jirachanchaisiri, Nam Tuan Ly, Atsuhiro Takasu:
TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images. 8844-8852 - Roberto Amoroso, Gengyuan Zhang, Rajat Koner, Lorenzo Baraldi, Rita Cucchiara, Volker Tresp:
Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries. 8853-8862 - Anindya Sundar Das, Guansong Pang, Monowar H. Bhuyan:
Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination. 8863-8872 - Yizhe Ruan, Lin Gu, Yusuke Kurose, Junichi Iho, Youji Tokunaga, Makoto Horie, Yusaku Hayashi, Keisuke Nishizawa, Yasushi Koyama, Tatsuya Harada:
Physiology-Aware PolySnake for Coronary Vessel Segmentation. 8873-8882 - Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Camillo J. Taylor:
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge. 8883-8894 - Anis Amziane:
Learning Deep Illumination-Robust Features from Multispectral Filter Array Images. 8895-8904 - Farhad G. Zanjani, Hong Cai, Hanno Ackermann, Leyla Mirvakhabova, Fatih Porikli:
Planar Gaussian Splatting. 8905-8914 - Takuya Asakura, Nakamasa Inoue, Koichi Shinoda:
Diffusion-Based Generative Regularization for Supervised Discriminative Learning. 8915-8926 - Skanda Bharadwaj, Robert T. Collins, Yanxi Liu:
Recurrence-Based Vanishing Point Detection. 8927-8936 - Geonu Lee, Yonghyun Jeong, Haneol Jang, Youngjoon Yoo:
Domain-Generalized Object Anti-Spoofing: Bridging Gaps and Patch Selection for Robust Detection Across Domains. 8937-8946 - Juho Jung, Migyeong Yang, Hyunseon Won, Jiwon Kim, Jeong Mo Han, Joon Seo Hwang, Daniel Duck-Jin Hwang, Jinyoung Han:
CAMEL: Confidence-Aware Multi-Task Ensemble Learning with Spatial Information for Retina OCT Image Classification and Segmentation. 8947-8957 - Sangyeon Kim, Sangkuk Lee, Jeesoo Kim, Nojun Kwak:
TPD-STR: Text Polygon Detection with Split Transformers. 8958-8967 - Hila Levi, Guy Heller, Dan Levi:
FOR: Finetuning for Object Level Open Vocabulary Image Retrieval. 8968-8979 - Simon Thomine, Hichem Snoussi:
Single-Layer Distillation with Fourier Convolutions for Texture Anomaly Detection. 8980-8989 - Kha Nhat Le, Hoang-Tuan Nguyen, Hung Tien Tran, Thanh Duc Ngo:
Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition. 8990-9000 - Suhas Srinath, Aditya Chandrasekar, Hemang Jamadagni, Rajiv Soundararajan, Prathosh AP:
UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors. 9001-9012 - Maxime Fontana, Michael W. Spratling, Miaojing Shi:
Optimizing Dense Visual Predictions Through Multi-Task Coherence and Prioritization. 9013-9022 - Heitor Rapela Medeiros, David Latortue, Eric Granger, Marco Pedersoli:
Mixed Patch Visible-Infrared Modality Agnostic Object Detection. 9023-9032 - Ioannis Sarridis, Christos Koutlis, Giorgos Kordopatis-Zilos, Ioannis Kompatsiaris, Symeon Papadopoulos:
InDistill: Information flow-preserving knowledge distillation for model compression. 9033-9042 - Rohit K. Bharadwaj, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan:
Enhancing Novel Object Detection via Cooperative Foundational Models. 9043-9052 - Yoko Sogabe, Shiori Sugimoto, Ayumi Matsumoto, Masaki Kitahara:
Pre-capture Privacy via Adaptive Single-Pixel Imaging. 9053-9062 - Floyd Hepburn-Dickins, Mark W. Jones, Mike Edwards, Jay Paul Morgan, Steve Bell:
SIGNN - Star Identification Using Graph Neural Networks. 9063-9072 - Seunghwan Choi, Jooyeol Yun, Jeonghoon Park, Jaegul Choo:
Disentangling Subject-Irrelevant Elements in Personalized Text-to-Image Diffusion via Filtered Self-Distillation. 9073-9082 - Wei-Jhe Huang, Min-Hung Chen, Shang-Hong Lai:
Spatio-Temporal Context Prompting for Zero-Shot Action Detection. 9083-9092 - Sachin Verma, Frank Lindseth, Gabriel Kiss:
SegDesicNet: Lightweight Semantic Segmentation in Remote Sensing with Geo-Coordinate Embeddings for Domain Adaptation. 9093-9104 - Arushi Rai, Adriana Kovashka:
Rubric-Constrained Figure Skating Scoring. 9105-9113 - Junyoung Hong, Hyeri Yang, Ye Ju Kim, Haerim Kim, Shinwoong Kim, Euna Shim, Kyungjae Lee:
D2FP: Learning Implicit Prior for Human Parsing. 9114-9124 - Aniket Roy, Anshul Shah, Ketul Shah, Anirban Roy, Rama Chellappa:
Cap2Aug: Caption Guided Image data Augmentation. 9125-9135 - Zijun He, Lishun Wang, Ziyi Meng, Xin Yuan:
Self-supervised Learning with Spectral Low-Rank Prior for Hyperspectral Image Reconstruction. 9136-9145 - Roman Colman, Minh Vu, Manish Bhattarai, Martin Ma, Hari S. Viswanathan, Daniel O'Malley, Javier E. Santos:
PatchFinder: Leveraging Visual Language Models for Accurate Information Retrieval Using Model Uncertainty. 9146-9155 - Debanjan Goswami, Shayok Chakraborty:
Active Learning for Image Segmentation with Binary User Feedback. 9156-9165 - Shin Ishihara, Imari Sato:
Per-Pixel Solution of Multispectral Photometric Stereo. 9166-9175 - Sanaz Karimijafarbigloo, Sina Ghorbani Kolahi, Reza Azad, Ulas Bagci, Dorit Merhof:
Frequency-Domain Refinement of Vision Transformers for Robust Medical Image Segmentation Under Degradation. 9176-9185 - Anirban Roy, Adam D. Cobb, Ramneet Kaur, Sumit Jha, Nathaniel D. Bastian, Alexander M. Berenbeim, Robert H. Thomson, Iain Cruickshank, Alvaro Velasquez, Susmit Jha:
Zero-Shot Detection of Out-of-Context Objects Using Foundation Models. 9186-9195 - Mujing Li, Guanjie Wang, Xingguang Zhang, Qifeng Liao, Chenxi Xiao:
D-LUT: Photorealistic Style Transfer via Diffusion Process. 9206-9214 - Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando:
Situational Scene Graph for Structured Human-Centric Situation Understanding. 9215-9225 - Zhuo Cao, Bingqing Zhang, Heming Du, Xin Yu, Xue Li, Sen Wang:
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. 9226-9236 - David Pujol-Perich, Albert Clapés, Sergio Escalera:
SADA: Semantic Adversarial Unsupervised Domain Adaptation for Temporal Action Localization. 9237-9247 - Jingyu Song, Xudong Chen, Liupei Lu, Jie Li, Katherine A. Skinner:
MemFusionMap: Working Memory Fusion for Online Vectorized HD Map Construction. 9248-9257 - Tz-Ying Wu, Kyle Min, Subarna Tripathi, Nuno Vasconcelos:
Ego-VPA: Egocentric Video Understanding with Parameter-Efficient Adaptation. 9258-9268 - Sai Bhargav Rongali, Mohamad Hassan N C, Ankit Jha, Neha Bhargava, Saurabh Prasad, Biplab Banerjee:
Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering. 9269-9279 - Liang Chen, Weihua Chen, Xin Zhao, Junyan Wang, Lijun Cao, Junge Zhang:
Distribution Optimization Under Gaussian Hypothesis for Domain Adaptive Semantic Segmentation. 9280-9290 - Lucas Jaffe, Avideh Zakhor:
Swap Path Network for Robust Person Search Pre-training. 9291-9301 - Frano Rajic, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu:
Segment Anything Meets Point Tracking. 9302-9311 - Raghavendra Ramachandra, Sushma Venkatesh, Guoqiang Li:
PoolAtnRes: Towards Generalisable Differential Morphing Attack Detection. 9312-9321 - Jeffri Murrugarra-Llerena, Cláudio R. Jung:
Noise-Aware Evaluation of Object Detectors. 9322-9331 - Yidan Shen, Yu Wen, Chen Zhang, Xin Fu, Renjie Hu:
MVMD: A Multi-View Approach for Enhanced Mirror Detection. 9332-9341 - Jiaoyang Yin, Bin Fan, Chao Xu, Tiejun Huang, Boxin Shi:
Spk2ImgMamba: Spiking Camera Image Reconstruction with Multi-Scale State Space Models. 9342-9352 - Hanning Chen, Yang Ni, Wenjun Huang, Yezi Liu, Sungheon Jeong, Fei Wen, Nathaniel D. Bastian, Hugo Latapie, Mohsen Imani:
VLTP: Vision-Language Guided Token Pruning for Task-Oriented Segmentation. 9353-9363 - Jimut B. Pal, Shantanu Welling, Himali Saini, Suyash P. Awate:
Reviving Poor Object Segmentations in OOD Medical Images using Variational-Deep-PCA Modeling on Segmentation Maps with Sampling-Free Learning. 9364-9373 - Sungyeon Kim, Donghyun Kim, Suha Kwak:
Learning Unified Distance Metric Across Diverse Data Distributions with Parameter-Efficient Transfer Learning. 9374-9384 - Yue Ma, Xiaodong Cun, Sen Liang, Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen:
MagicStick: Controllable Video Editing via Control Handle Transformations. 9385-9395 - Zehua Cheng, Di Yuan, Wenhu Zhang, Thomas Lukasiewicz:
Effective and Efficient Medical Image Segmentation with Hierarchical Context Interaction. 9396-9405 - Jeongseok Hyun, Su Ho Han, Hyolim Kang, Joon-Young Lee, Seon Joo Kim:
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization. 9406-9415 - Junwen Chen, Yingcheng Wang, Keiji Yanai:
Focusing on what to Decode and what to Train: SOV Decoding with Specific Target Guided DeNoising and Vision Language Advisor. 9416-9425 - Hayoung Park, Choongsang Cho, Guisik Kim:
On the Importance of Dual-Space Augmentation for Domain Generalized Object Detection. 9426-9436 - Junbo Jang, Chanyeong Park, Heegwang Kim, Jiyoon Lee, Joonki Paik:
Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules. 9437-9446 - Jiin Im, Yongho Son, Je Hyeong Hong:
FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data. 9447-9456 - Subhajit Paul, Sahil Kumawat, Ashutosh Gupta, Deepak Mishra:
F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring. 9457-9467 - Ram J. Zaveri, Shivang Patel, Yu Gu, Gianfranco Doretto:
Improving Accuracy and Generalization for Efficient Visual Tracking. 9468-9478 - Jiwon Yoo, Dami Ko, Gyeonghwan Kim:
CCASeg: Decoding Multi-Scale Context with Convolutional Cross-Attention for Semantic Segmentation. 9479-9488 - Abbas Khan, Muhammad Asad, Martin Benning, Caroline H. Roney, Gregory G. Slabaugh:
Compositional Segmentation of Cardiac Images Leveraging Metadata. 9489-9498 - Rakesh Raj Madavan, Akshat Kaimal, Badhrinarayanan K. V, Vinayak Gupta, Rohit Choudhary, Chandrakala Shanmuganathan, Kaushik Mitra:
GANESH: Generalizable NeRF for Lensless Imaging. 9499-9508 - Hoonhee Cho, Jae-Young Kang, Taewoo Kim, Yuhwan Jeong, Kuk-Jin Yoon:
Unifying Low-Resolution and High-Resolution Alignment by Event Cameras for Space-Time Video Super-Resolution. 9509-9520 - Reza Ghoddoosian, Nakul Agarwal, Isht Dwivedi, Behzad Dariush:
ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos. 9521-9531 - Meng Ye, Bingyu Xin, Leon Axel, Dimitris N. Metaxas:
Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation. 9532-9542 - Badri N. Patro, Vinay P. Namboodiri, Vijay Srinivas Agneeswaran:
SpectFormer: Frequency and Attention is what you need in a Vision Transformer. 9543-9554 - Masatoshi Tateno, Takuma Yagi, Ryosuke Furuta, Yoichi Sato:
Learning Multiple Object States from Actions via Large Language Models. 9555-9565 - Yijie Hu, Guanyu Yang, Zhaorui Tan, Xiaowei Huang, Kaizhu Huang, Qiufeng Wang:
Covariance-Based Space Regularization for Few-Shot Class Incremental Learning. 9566-9576 - Jin-Cheng Jhang, Tao Tu, Fu-En Wang, Ke Zhang, Min Sun, Cheng-Hao Kuo:
V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations. 9577-9586 - Chen Xu, Chunguo Li, Hongjie Xing:
Discriminative Score Suppression for Weakly Supervised Video Anomaly Detection. 9587-9596 - Amartya Roy Chowdhury, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Achyut Mani Tripathi:
Bandit-based Attention Mechanism in Vision Transformers. 9597-9606 - Zhefan Rao, Tianjia Zhang, Yuen-Fui Lau, Qifeng Chen:
Robust Portrait Image Matting and Depth-of-field Synthesis via Multiplane Images. 9607-9617 - Md Raqib Khan, Anshul Negi, Ashutosh Kulkarni, Shruti S. Phutke, Santosh Kumar Vipparthi, Subrahmanyam Murala:
Phaseformer: Phase-Based Attention Mechanism for Underwater Image Restoration and Beyond. 9618-9629 - Vaibhav Vavilala, Faaris Shaik, David A. Forsyth:
Dequantization and Color Transfer with Diffusion Models. 9630-9639 - Mingchen Xu, Peter Herbert, Yu-Kun Lai, Ze Ji, Jing Wu:
RGB-D Video Mirror Detection. 9640-9649 - Liuyue Xie, Jiancong Guo, László A. Jeni, Zhiheng Jia, Mingyang Li, Yunwen Zhou, Chao Guo:
Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field. 9650-9659 - Min Jin Chong, Dejia Xu, Yi Zhang, Zhangyang Wang, David A. Forsyth, Gurunandan Krishnan, Yicheng Wu, Jian Wang:
Copy or Not? Reference-Based Face Image Restoration with Fine Details. 9660-9669 - A S. M. Iftekhar, Raphael Ruschel, Satish Kumar, Suya You, B. S. Manjunath:
DDS: Decoupled Dynamic Scene-Graph Generation Network. 9670-9680 - João P. K. Ferreira, João P. L. Pinto, Júlia S. Moura, Yi Li, Cristiano Leite Castro, Plamen Angelov:
Vision-Based Landing Guidance Through Tracking and Orientation Estimation. 9681-9689 - Abhisek Ray, Ayush Raj, Maheshkumar H. Kolekar:
Autoregressive Adaptive Hypergraph Transformer for Skeleton-Based Activity Recognition. 9690-9699 - Minje Kim, Minjun Kim, Xu Yang:
DTA: Dual Temporal-channel-wise Attention for Spiking Neural Networks. 9700-9710
