Hao Tan 0002
Person information
- affiliation: Adobe Research
- affiliation (former): University of North Carolina, Chapel Hill, NC, USA
Other persons with the same name
- Hao Tan — disambiguation page
- Hao Tan 0001 — Hunan University, State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Changsha, China
- Hao Tan 0003 — Southern University of Science and Technology, University Key Laboratory of Evolving Intelligent Systems of Guangdong Province, Shenzhen, China
- Hao Tan 0004 — Southwest University, Chongqing, China
- Hao Tan 0005 — National Engineering Laboratory for Deep Coal Construction Technology in Coal Mines, Beijing, China
- Hao Tan 0006 — University of Waterloo, Waterloo, ON, Canada
2020 – today
- 2024
- [c30] Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman:
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning. CVPR 2024: 6369-6379
- [c29] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu:
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. ECCV (22) 2024: 1-19
- [c28] Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang:
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction. ICLR 2024
- [c27] Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liangyan Gui, Tong Sun, Yu-Xiong Wang:
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation. ICLR 2024
- [c26] Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan:
LRM: Large Reconstruction Model for Single Image to 3D. ICLR 2024
- [c25] Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi:
Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model. ICLR 2024
- [c24] Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang:
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model. ICLR 2024
- [i41] Zhenzhen Weng, Jingyuan Liu, Hao Tan, Zhan Xu, Yang Zhou, Serena Yeung-Levy, Jimei Yang:
Single-View 3D Human Digitalization with Large Reconstruction Models. CoRR abs/2401.12175 (2024)
- [i40] Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, Zexiang Xu:
MeshLRM: Large Reconstruction Model for High-Quality Mesh. CoRR abs/2404.12385 (2024)
- [i39] Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang:
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation. CoRR abs/2404.12386 (2024)
- [i38] Yuan Zang, Tian Yun, Hao Tan, Trung Bui, Chen Sun:
Pre-trained Vision-Language Models Learn Discoverable Visual Concepts. CoRR abs/2404.12652 (2024)
- [i37] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu:
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. CoRR abs/2404.19702 (2024)
- [i36] Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie E. Kaufman, Xin Sun, Hao Tan:
LRM-Zero: Training Large Reconstruction Models with Synthesized Data. CoRR abs/2406.09371 (2024)
- [i35] Minh Nguyen, Franck Dernoncourt, Seunghyun Yoon, Hanieh Deilamsalehy, Hao Tan, Ryan A. Rossi, Quan Hung Tran, Trung Bui, Thien Huu Nguyen:
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models. CoRR abs/2407.12094 (2024)
- [i34] Tianyuan Zhang, Zhengfei Kuang, Haian Jin, Zexiang Xu, Sai Bi, Hao Tan, He Zhang, Yiwei Hu, Milos Hasan, William T. Freeman, Kai Zhang, Fujun Luan:
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models. CoRR abs/2410.06231 (2024)
- [i33] Desai Xie, Zhan Xu, Yicong Hong, Hao Tan, Difan Liu, Feng Liu, Arie E. Kaufman, Yang Zhou:
Progressive Autoregressive Video Diffusion Models. CoRR abs/2410.08151 (2024)
- [i32] Ziwen Chen, Hao Tan, Kai Zhang, Sai Bi, Fujun Luan, Yicong Hong, Fuxin Li, Zexiang Xu:
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats. CoRR abs/2410.12781 (2024)
- [i31] Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, Zexiang Xu:
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias. CoRR abs/2410.17242 (2024)
- [i30] Gene Chou, Kai Zhang, Sai Bi, Hao Tan, Zexiang Xu, Fujun Luan, Bharath Hariharan, Noah Snavely:
Generating 3D-Consistent Videos from Unposed Internet Photos. CoRR abs/2411.13549 (2024)
- [i29] Zhengfei Kuang, Tianyuan Zhang, Kai Zhang, Hao Tan, Sai Bi, Yiwei Hu, Zexiang Xu, Milos Hasan, Gordon Wetzstein, Fujun Luan:
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors. CoRR abs/2411.17249 (2024)
- 2023
- [c23] Yicong Hong, Yang Zhou, Ruiyi Zhang, Franck Dernoncourt, Trung Bui, Stephen Gould, Hao Tan:
Learning Navigational Visual Representations with Semantic Map Supervision. ICCV 2023: 3032-3044
- [c22] Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao:
Scaling Data Generation in Vision-and-Language Navigation. ICCV 2023: 11975-11986
- [c21] Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen:
Boosting Punctuation Restoration with Data Generation and Reinforcement Learning. INTERSPEECH 2023: 2133-2137
- [i28] Yicong Hong, Yang Zhou, Ruiyi Zhang, Franck Dernoncourt, Trung Bui, Stephen Gould, Hao Tan:
Learning Navigational Visual Representations with Semantic Map Supervision. CoRR abs/2307.12335 (2023)
- [i27] Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Hung Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen:
Boosting Punctuation Restoration with Data Generation and Reinforcement Learning. CoRR abs/2307.12949 (2023)
- [i26] Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao:
Scaling Data Generation in Vision-and-Language Navigation. CoRR abs/2307.15644 (2023)
- [i25] Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan:
LRM: Large Reconstruction Model for Single Image to 3D. CoRR abs/2311.04400 (2023)
- [i24] Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi:
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model. CoRR abs/2311.06214 (2023)
- [i23] Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang:
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model. CoRR abs/2311.09217 (2023)
- [i22] Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang:
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction. CoRR abs/2311.12024 (2023)
- [i21] Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman:
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning. CoRR abs/2312.13980 (2023)
- 2022
- [c20] Hao Tan, Chen-Tse Tsai, Yujie He, Mohit Bansal:
Scientific Chart Summarization: Datasets and Improved Text Modeling. SDU@AAAI 2022
- [c19] Jialu Li, Hao Tan, Mohit Bansal:
Envedit: Environment Editing for Vision-and-Language Navigation. CVPR 2022: 15386-15396
- [c18] Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer:
How Much Can CLIP Benefit Vision-and-Language Tasks? ICLR 2022
- [c17] Jialu Li, Hao Tan, Mohit Bansal:
CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations. NAACL-HLT (Findings) 2022: 633-649
- [i20] Jialu Li, Hao Tan, Mohit Bansal:
EnvEdit: Environment Editing for Vision-and-Language Navigation. CoRR abs/2203.15685 (2022)
- [i19] Jialu Li, Hao Tan, Mohit Bansal:
CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations. CoRR abs/2207.02185 (2022)
- 2021
- [c16] Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal:
Unifying Vision-and-Language Tasks via Text Generation. ICML 2021: 1931-1942
- [c15] Jialu Li, Hao Tan, Mohit Bansal:
Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information. NAACL-HLT 2021: 1041-1050
- [c14] Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal:
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer. NeurIPS 2021: 24468-24481
- [i18] Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal:
Unifying Vision-and-Language Tasks via Text Generation. CoRR abs/2102.02779 (2021)
- [i17] Jialu Li, Hao Tan, Mohit Bansal:
Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information. CoRR abs/2104.09580 (2021)
- [i16] Hao Tan, Jie Lei, Thomas Wolf, Mohit Bansal:
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning. CoRR abs/2106.11250 (2021)
- [i15] Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal:
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer. CoRR abs/2107.02681 (2021)
- [i14] Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer:
How Much Can CLIP Benefit Vision-and-Language Tasks? CoRR abs/2107.06383 (2021)
- 2020
- [c13] Hyounghun Kim, Hao Tan, Mohit Bansal:
Modality-Balanced Models for Visual Dialogue. AAAI 2020: 8091-8098
- [c12] Qinxin Wang, Hao Tan, Sheng Shen, Michael W. Mahoney, Zhewei Yao:
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding. EMNLP (1) 2020: 2030-2038
- [c11] Hao Tan, Mohit Bansal:
Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision. EMNLP (1) 2020: 2066-2080
- [c10] Hyounghun Kim, Abhaysinh Zala, Graham Burri, Hao Tan, Mohit Bansal:
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments. EMNLP (Findings) 2020: 3910-3927
- [c9] Xiang Zhou, Yixin Nie, Hao Tan, Mohit Bansal:
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions. EMNLP (1) 2020: 8215-8228
- [c8] Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz:
Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning. ICRA 2020: 1963-1969
- [c7] Yubo Zhang, Hao Tan, Mohit Bansal:
Diagnosing the Environment Bias in Vision-and-Language Navigation. IJCAI 2020: 890-897
- [i13] Hyounghun Kim, Hao Tan, Mohit Bansal:
Modality-Balanced Models for Visual Dialogue. CoRR abs/2001.06354 (2020)
- [i12] Xiang Zhou, Yixin Nie, Hao Tan, Mohit Bansal:
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions. CoRR abs/2004.13606 (2020)
- [i11] Yubo Zhang, Hao Tan, Mohit Bansal:
Diagnosing the Environment Bias in Vision-and-Language Navigation. CoRR abs/2005.03086 (2020)
- [i10] Qinxin Wang, Hao Tan, Sheng Shen, Michael W. Mahoney, Zhewei Yao:
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding. CoRR abs/2010.05379 (2020)
- [i9] Hao Tan, Mohit Bansal:
Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision. CoRR abs/2010.06775 (2020)
- [i8] Hyounghun Kim, Abhay Zala, Graham Burri, Hao Tan, Mohit Bansal:
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments. CoRR abs/2011.07660 (2020)
2010 – 2019
- 2019
- [c6] Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal:
Expressing Visual Relationships via Language. ACL (1) 2019: 1873-1883
- [c5] Hao Tan, Mohit Bansal:
LXMERT: Learning Cross-Modality Encoder Representations from Transformers. EMNLP/IJCNLP (1) 2019: 5099-5110
- [c4] Hao Tan, Licheng Yu, Mohit Bansal:
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout. NAACL-HLT (1) 2019: 2610-2621
- [i7] Hao Tan, Licheng Yu, Mohit Bansal:
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout. CoRR abs/1904.04195 (2019)
- [i6] Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz:
Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning. CoRR abs/1904.12907 (2019)
- [i5] Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal:
Expressing Visual Relationships via Language. CoRR abs/1906.07689 (2019)
- [i4] Hao Tan, Mohit Bansal:
LXMERT: Learning Cross-Modality Encoder Representations from Transformers. CoRR abs/1908.07490 (2019)
- 2018
- [c3] Hao Tan, Mohit Bansal:
Source-Target Inference Models for Spatial Instruction Understanding. AAAI 2018: 5504-5511
- [c2] Hao Tan, Mohit Bansal:
Object Ordering with Bidirectional Matchings for Visual Reasoning. NAACL-HLT (2) 2018: 444-451
- [i3] Hao Tan, Mohit Bansal:
Object Ordering with Bidirectional Matchings for Visual Reasoning. CoRR abs/1804.06870 (2018)
- 2017
- [c1] Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg:
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions. CVPR 2017: 3521-3529
- [i2] Hao Tan, Mohit Bansal:
Source-Target Inference Models for Spatial Instruction Understanding. CoRR abs/1707.03804 (2017)
- 2016
- [i1] Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg:
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions. CoRR abs/1612.09542 (2016)
last updated on 2025-01-14 22:16 CET by the dblp team
all metadata released as open data under CC0 1.0 license