- GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. Kai Zhang 0045, Sai Bi, Hao Tan 0002, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu. 1-19
- Robust-Wide: Robust Watermarking Against Instruction-Driven Image Editing. Runyi Hu, Jie Zhang 0073, Ting Xu, Jiwei Li 0001, Tianwei Zhang 0004. 20-37
- OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal. Qiao Mo, Yukang Ding, Jinhua Hao, Qiang Zhu, Ming Sun 0008, Chao Zhou 0003, Feiyu Chen 0001, Shuyuan Zhu. 38-56
- Formula-Supervised Visual-Geometric Pre-training. Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh. 57-74
- VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding. Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du 0001, Jiaqi Li, Zhi Gao, Qing Li 0003. 75-92
- Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-spoofing. Guanghao Zheng, Yuchen Liu 0006, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong. 93-110
- Restoring Images in Adverse Weather Conditions via Histogram Transformer. Shangquan Sun, Wenqi Ren, Xinwei Gao, Rui Wang 0032, Xiaochun Cao. 111-129
- PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer. Tongkun Guan, Chengyu Lin 0002, Wei Shen 0002, Xiaokang Yang. 130-147
- NGP-RT: Fusing Multi-level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis. Yubin Hu 0001, Xiaoyang Guo, Yang Xiao, Jingwei Huang 0001, Yong-Jin Liu. 148-165
- Elysium: Exploring Object-Level Perception in Videos via MLLM. Han Wang, Yongjie Ye, Yanjie Wang, Yuxiang Nie, Can Huang. 166-185
- 2fR: Frequency Regularization in Grid-Based Feature Encoding Neural Radiance Fields. Shuxiang Xie, Shuyi Zhou, Ken Sakurada, Ryoichi Ishikawa, Masaki Onishi, Takeshi Oishi. 186-203
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models. Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hannaneh Hajishirzi, Vasudev Lal, Chitta Baral, Yezhou Yang. 204-222
- Generating 3D House Wireframes with Semantics. Xueqi Ma, Yilin Liu, Wenjun Zhou, Ruowei Wang, Hui Huang 0004. 223-240
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image. Xiao Fu, Wei Yin 0006, Mu Hu, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin, Xiaoxiao Long. 241-258
- Shape-Guided Configuration-Aware Learning for Endoscopic-Image-Based Pose Estimation of Flexible Robotic Instruments. Yiyao Ma, Kai Chen 0028, Hon-Sing Tong, Ruofeng Wei, Yui-Lun Ng, Ka Wai Kwok, Qi Dou 0001. 259-276
- Nonverbal Interaction Detection. Jianan Wei, Tianfei Zhou, Yi Yang 0001, Wenguan Wang. 277-295
- UniM²AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving. Jian Zou, Tianyu Huang, Guanglei Yang, Zhenhua Guo 0001, Tao Luo 0014, Chun-Mei Feng, Wangmeng Zuo. 296-313
- Responsible Visual Editing. Minheng Ni, Yeli Shen, Lei Zhang 0006, Wangmeng Zuo. 314-330
- DragAnything: Motion Control for Anything Using Entity Representation. Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao 0001, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang. 331-348
- SegPoint: Segment Any Point Cloud via Large Language Model. Shuting He, Henghui Ding, Xudong Jiang 0001, Bihan Wen. 349-367
- Navigation Instruction Generation with BEV Perception and Large Language Models. Sheng Fan, Rui Liu, Wenguan Wang, Yi Yang 0001. 368-387
- Rebalancing Using Estimated Class Distribution for Imbalanced Semi-supervised Learning Under Class Distribution Mismatch. Taemin Park, Hyuck Lee, Heeyoung Kim. 388-404
- Vista3D: Unravel the 3D Darkside of a Single Image. Qiuhong Shen, Xingyi Yang, Michael Bi Mi, Xinchao Wang. 405-421
- The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation. Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin, Hongxia Xie, Terence Lin, Yi-Ning Huang, Hong-Han Shuai, Wen-Huang Cheng. 422-438
- Detecting as Labeling: Rethinking LiDAR-Camera Fusion in 3D Object Detection. Junjie Huang 0005, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du. 439-455
- FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally. Qiuhong Shen, Xingyi Yang, Xinchao Wang. 456-472
- Exploiting Dual-Correlation for Multi-frame Time-of-Flight Denoising. Guanting Dong, Yueyi Zhang, Xiaoyan Sun 0001, Zhiwei Xiong. 473-489