Csaba Szepesvári
Person information
- affiliation: University of Alberta
2020 – today
- 2024
- [c214]David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári:
Exploration via linearly perturbed loss minimisation. AISTATS 2024: 721-729 - [c213]Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz:
Stochastic Gradient Descent for Gaussian Processes Done Right. ICLR 2024 - [c212]Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári:
Switching the Loss Reduces the Cost in Batch Reinforcement Learning. ICML 2024 - [i141]Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans:
Stochastic Gradient Succeeds for Bandits. CoRR abs/2402.17235 (2024) - [i140]Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári:
Switching the Loss Reduces the Cost in Batch Reinforcement Learning. CoRR abs/2403.05385 (2024) - [i139]Johannes Kirschner, Seyed Alireza Bakhtiari, Kushagra Chandak, Volodymyr Tkachuk, Csaba Szepesvári:
Regret Minimization via Saddle Point Optimization. CoRR abs/2403.10379 (2024) - [i138]Yasin Abbasi-Yadkori, Ilja Kuzborskij, David Stutz, András György, Adam Fisch, Arnaud Doucet, Iuliya Beloshapka, Wei-Hung Weng, Yao-Yuan Yang, Csaba Szepesvári, Ali Taylan Cemgil, Nenad Tomasev:
Mitigating LLM Hallucinations via Conformal Abstention. CoRR abs/2405.01563 (2024) - [i137]Volodymyr Tkachuk, Gellért Weisz, Csaba Szepesvári:
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear qπ-Realizability and Concentrability. CoRR abs/2405.16809 (2024) - [i136]Yasin Abbasi-Yadkori, Ilja Kuzborskij, András György, Csaba Szepesvári:
To Believe or Not to Believe Your LLM. CoRR abs/2406.02543 (2024) - [i135]Tian Tian, Lin F. Yang, Csaba Szepesvári:
Confident Natural Policy Gradient for Local Planning in qπ-realizable Constrained MDPs. CoRR abs/2406.18529 (2024) - [i134]Shuai Liu, Alex Ayoub, Flore Sentenac, Xiaoqi Tan, Csaba Szepesvári:
Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits. CoRR abs/2410.01112 (2024)
- 2023
- [c211]Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári:
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning. AISTATS 2023: 6342-6370 - [c210]Sihan Liu, Gaurav Mahajan, Daniel Kane, Shachar Lovett, Gellért Weisz, Csaba Szepesvári:
Exponential Hardness of Reinforcement Learning with Linear Function Approximation. COLT 2023: 1588-1617 - [c209]Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvári, Zhaoran Wang:
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics. ICLR 2023 - [c208]Philip Amortila, Nan Jiang, Csaba Szepesvári:
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation. ICML 2023: 768-790 - [c207]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175 - [c206]Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans:
Stochastic Gradient Succeeds for Bandits. ICML 2023: 24325-24360 - [c205]Yao Zhao, Connor Stephens, Csaba Szepesvári, Kwang-Sung Jun:
Revisiting Simple Regret: Fast Rates for Returning a Good Arm. ICML 2023: 42110-42158 - [c204]Johannes Kirschner, Seyed Alireza Bakhtiari, Kushagra Chandak, Volodymyr Tkachuk, Csaba Szepesvári:
Regret Minimization via Saddle Point Optimization. NeurIPS 2023 - [c203]Chung-Wei Lee, Qinghua Liu, Yasin Abbasi-Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvári:
Context-lumpable stochastic bandits. NeurIPS 2023 - [c202]Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári:
Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL. NeurIPS 2023 - [c201]Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, Dale Schuurmans:
Ordering-based Conditions for Global Convergence of Policy Gradient Methods. NeurIPS 2023 - [c200]Gellért Weisz, András György, Csaba Szepesvári:
Online RL in Linearly qπ-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. NeurIPS 2023 - [c199]Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin:
Optimistic MLE: A Generic Model-Based Algorithm for Partially Observable Sequential Decision Making. STOC 2023: 363-376 - [i133]Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans:
The Role of Baselines in Policy Gradient Optimization. CoRR abs/2301.06276 (2023) - [i132]Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvári:
Sample Efficient Deep Reinforcement Learning via Local Planning. CoRR abs/2301.12579 (2023) - [i131]Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári:
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2302.04376 (2023) - [i130]Daniel Kane, Sihan Liu, Shachar Lovett, Gaurav Mahajan, Csaba Szepesvári, Gellért Weisz:
Exponential Hardness of Reinforcement Learning with Linear Function Approximation. CoRR abs/2302.12940 (2023) - [i129]Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári:
Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL. CoRR abs/2305.11032 (2023) - [i128]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023) - [i127]Chung-Wei Lee, Qinghua Liu, Yasin Abbasi-Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvári:
Context-lumpable stochastic bandits. CoRR abs/2306.13053 (2023) - [i126]Philip Amortila, Nan Jiang, Csaba Szepesvári:
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation. CoRR abs/2307.13332 (2023) - [i125]Gellért Weisz, András György, Csaba Szepesvári:
Online RL in Linearly qπ-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. CoRR abs/2310.07811 (2023) - [i124]Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz:
Stochastic Gradient Descent for Gaussian Processes Done Right. CoRR abs/2310.20581 (2023) - [i123]David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári:
Exploration via linearly perturbed loss minimisation. CoRR abs/2311.07565 (2023) - [i122]David Janz, Alexander E. Litvak, Csaba Szepesvári:
Ensemble sampling for linear bandits: small ensembles suffice. CoRR abs/2311.08376 (2023)
- 2022
- [c198]Botao Hao, Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári:
Confident Least Square Value Iteration with Local Access to a Simulator. AISTATS 2022: 2420-2435 - [c197]Anant Raj, Pooria Joulani, András György, Csaba Szepesvári:
Faster Rates, Adaptive Algorithms, and Finite-Time Bounds for Linear Composition Optimization and Gradient TD Learning. AISTATS 2022: 7176-7186 - [c196]Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvári:
The Curse of Passive Data Collection in Batch Reinforcement Learning. AISTATS 2022: 8413-8438 - [c195]Gellért Weisz, Csaba Szepesvári, András György:
TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions. ALT 2022: 1097-1137 - [c194]Dong Yin, Botao Hao, Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári:
Efficient local planning with linear function approximation. ALT 2022: 1165-1192 - [c193]Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin:
When Is Partially Observable Reinforcement Learning Not Scary? COLT 2022: 5175-5220 - [c192]Qinghua Liu, Csaba Szepesvári, Chi Jin:
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games. NeurIPS 2022 - [c191]Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvári, Dale Schuurmans:
The Role of Baselines in Policy Gradient Optimization. NeurIPS 2022 - [c190]Sharan Vaswani, Lin Yang, Csaba Szepesvári:
Near-Optimal Sample Complexity Bounds for Constrained MDPs. NeurIPS 2022 - [c189]Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári:
Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs. NeurIPS 2022 - [c188]Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvári, Mengdi Wang:
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization. NeurIPS 2022 - [c187]Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai:
A free lunch from the noise: Provable and practical exploration for representation learning. UAI 2022: 1686-1696 - [e4]Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato:
International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Proceedings of Machine Learning Research 162, PMLR 2022 [contents] - [i121]Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvári, Doina Precup:
Towards Painless Policy Optimization for Constrained MDPs. CoRR abs/2204.05176 (2022) - [i120]Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin:
When Is Partially Observable Reinforcement Learning Not Scary? CoRR abs/2204.08967 (2022) - [i119]Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022) - [i118]Qinghua Liu, Csaba Szepesvári, Chi Jin:
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games. CoRR abs/2206.01315 (2022) - [i117]Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvári, Mengdi Wang:
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization. CoRR abs/2206.02092 (2022) - [i116]Sharan Vaswani, Lin F. Yang, Csaba Szepesvári:
Near-Optimal Sample Complexity Bounds for Constrained MDPs. CoRR abs/2206.06270 (2022) - [i115]Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin:
Optimistic MLE - A Generic Model-based Algorithm for Partially Observable Sequential Decision Making. CoRR abs/2209.14997 (2022) - [i114]Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári:
Confident Approximate Policy Iteration for Efficient Local Planning in qπ-realizable MDPs. CoRR abs/2210.15755 (2022) - [i113]Yao Zhao, Connor Stephens, Csaba Szepesvári, Kwang-Sung Jun:
Revisiting Simple Regret Minimization in Multi-Armed Bandits. CoRR abs/2210.16913 (2022) - [i112]Ilja Kuzborskij, Csaba Szepesvári:
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks. CoRR abs/2212.13848 (2022)
- 2021
- [j45]María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári:
Tighter Risk Certificates for Neural Networks. J. Mach. Learn. Res. 22: 227:1-227:40 (2021) - [j44]Yuxi Li, Alborz Geramifard, Lihong Li, Csaba Szepesvári, Tao Wang:
Guest editorial: special issue on reinforcement learning for real life. Mach. Learn. 110(9): 2291-2293 (2021) - [c186]Botao Hao, Tor Lattimore, Csaba Szepesvári, Mengdi Wang:
Online Sparse Reinforcement Learning. AISTATS 2021: 316-324 - [c185]Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvári:
Adaptive Approximate Policy Iteration. AISTATS 2021: 523-531 - [c184]Ilja Kuzborskij, Claire Vernade, András György, Csaba Szepesvári:
Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting. AISTATS 2021: 640-648 - [c183]Gellért Weisz, Philip Amortila, Csaba Szepesvári:
Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions. ALT 2021: 1237-1264 - [c182]Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári:
Asymptotically Optimal Information-Directed Sampling. COLT 2021: 2777-2821 - [c181]Ilja Kuzborskij, Csaba Szepesvári:
Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping. COLT 2021: 2853-2890 - [c180]Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári:
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function. COLT 2021: 4355-4385 - [c179]Dongruo Zhou, Quanquan Gu, Csaba Szepesvári:
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes. COLT 2021: 4532-4576 - [c178]Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang:
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient. ICML 2021: 4063-4073 - [c177]Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang:
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference. ICML 2021: 4074-4084 - [c176]Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvári:
A Distribution-dependent Analysis of Meta Learning. ICML 2021: 5697-5706 - [c175]Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvári:
Meta-Thompson Sampling. ICML 2021: 5884-5893 - [c174]Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári:
Improved Regret Bound and Experience Replay in Regularized Policy Iteration. ICML 2021: 6032-6042 - [c173]Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans:
Leveraging Non-uniformity in First-order Non-convex Optimization. ICML 2021: 7555-7564 - [c172]Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvári, Dale Schuurmans:
On the Optimality of Batch Policy Optimization Algorithms. ICML 2021: 11362-11371 - [c171]Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang:
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method. NeurIPS 2021: 2228-2240 - [c170]Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans:
Understanding the Effect of Stochasticity in Policy Optimization. NeurIPS 2021: 19339-19351 - [c169]Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári:
No Regrets for Learning the Prior in Bandits. NeurIPS 2021: 28029-28041 - [c168]Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu:
On the Role of Optimization in Double Descent: A Least Squares Study. NeurIPS 2021: 29567-29577 - [i111]Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári:
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function. CoRR abs/2102.02049 (2021) - [i110]Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang:
Bootstrapping Statistical Inference for Off-Policy Evaluation. CoRR abs/2102.03607 (2021) - [i109]Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvári:
Meta-Thompson Sampling. CoRR abs/2102.06129 (2021) - [i108]Nevena Lazic, Botao Hao, Yasin Abbasi-Yadkori, Dale Schuurmans, Csaba Szepesvári:
Optimization Issues in KL-Constrained Approximate Policy Iteration. CoRR abs/2102.06234 (2021) - [i107]Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang:
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method. CoRR abs/2102.08607 (2021) - [i106]Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvári:
Improved Regret Bound and Experience Replay in Regularized Policy Iteration. CoRR abs/2102.12611 (2021) - [i105]Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvári, Dale Schuurmans:
On the Optimality of Batch Policy Optimization Algorithms. CoRR abs/2104.02293 (2021) - [i104]Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans:
Leveraging Non-uniformity in First-order Non-convex Optimization. CoRR abs/2105.06072 (2021) - [i103]Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, András György, Csaba Szepesvári, Raia Hadsell, Nicolas Heess, Martin A. Riedmiller:
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning. CoRR abs/2106.08199 (2021) - [i102]Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvári:
On the Sample Complexity of Batch Reinforcement Learning with Policy-Induced Data. CoRR abs/2106.09973 (2021) - [i101]Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári:
No Regrets for Learning the Prior in Bandits. CoRR abs/2107.06196 (2021) - [i100]Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu:
On the Role of Optimization in Double Descent: A Least Squares Study. CoRR abs/2107.12685 (2021) - [i99]Dong Yin, Botao Hao, Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári:
Efficient Local Planning with Linear Function Approximation. CoRR abs/2108.05533 (2021) - [i98]Gellért Weisz, Csaba Szepesvári, András György:
TensorPlan and the Few Actions Lower Bound for Planning in MDPs under Linear Realizability of Optimal Value Functions. CoRR abs/2110.02195 (2021) - [i97]Han Zhong, Zhuoran Yang, Zhaoran Wang, Csaba Szepesvári:
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs. CoRR abs/2110.08984 (2021) - [i96]Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans:
Understanding the Effect of Stochasticity in Policy Optimization. CoRR abs/2110.15572 (2021) - [i95]Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai:
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning. CoRR abs/2111.11485 (2021)
- 2020
- [j43]Karl Tuyls, Julien Pérolat, Marc Lanctot, Edward Hughes, Richard Everett, Joel Z. Leibo, Csaba Szepesvári, Thore Graepel:
Bounds and dynamics for empirical game theoretic analysis. Auton. Agents Multi Agent Syst. 34(1): 7 (2020) - [j42]Yao Ma, Alex Olshevsky, Csaba Szepesvári, Venkatesh Saligrama:
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. J. Mach. Learn. Res. 21: 133:1-133:36 (2020) - [j41]Pooria Joulani, András György, Csaba Szepesvári:
A modular analysis of adaptive (non-)convex optimization: Optimism, composite objectives, variance reduction, and variational bounds. Theor. Comput. Sci. 808: 108-138 (2020) - [c167]Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier:
Randomized Exploration in Generalized Linear Bandits. AISTATS 2020: 2066-2076 - [c166]Botao Hao, Tor Lattimore, Csaba Szepesvári:
Adaptive Exploration in Linear Contextual Bandit. AISTATS 2020: 3536-3545 - [c165]Tor Lattimore, Csaba Szepesvári:
Exploration by Optimisation in Partial Monitoring. COLT 2020: 2488-2515 - [c164]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. ICLR 2020 - [c163]Alex Ayoub, Zeyu Jia, Csaba Szepesvári, Mengdi Wang, Lin Yang:
Model-Based Reinforcement Learning with Value-Targeted Regression. ICML 2020: 463-474 - [c162]Pooria Joulani, Anant Raj, András György, Csaba Szepesvári:
A simpler approach to accelerated optimization: iterative averaging meets optimism. ICML 2020: 4984-4993 - [c161]Tor Lattimore, Csaba Szepesvári, Gellért Weisz:
Learning with Good Feature Representations in Bandits and in RL with a Generative Model. ICML 2020: 5662-5670 - [c160]Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans:
On the Global Convergence Rates of Softmax Policy Gradient Methods. ICML 2020: 6820-6829 - [c159]Zeyu Jia, Lin Yang, Csaba Szepesvári, Mengdi Wang:
Model-Based Reinforcement Learning with Value-Targeted Regression. L4DC 2020: 666-686 - [c158]Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvári, Manzil Zaheer:
Differentiable Meta-Learning of Bandit Policies. NeurIPS 2020 - [c157]Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans:
CoinDICE: Off-Policy Confidence Interval Estimation. NeurIPS 2020 - [c156]Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvári, Dale Schuurmans:
Escaping the Gravitational Pull of Softmax. NeurIPS 2020 - [c155]Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári:
Model Selection in Contextual Stochastic Bandit Problems. NeurIPS 2020 - [c154]Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvári, John Shawe-Taylor:
PAC-Bayes Analysis Beyond the Usual Bounds. NeurIPS 2020 - [c153]Roshan Shariff, Csaba Szepesvári:
Efficient Planning in Large MDPs with Weak Linear Function Approximation. NeurIPS 2020 - [c152]Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Online Algorithm for Unsupervised Sequential Selection with Contextual Information. NeurIPS 2020 - [c151]Gellért Weisz, András György, Wei-I Lin, Devon R. Graham, Kevin Leyton-Brown, Csaba Szepesvári, Brendan Lucier:
ImpatientCapsAndRuns: Approximately Optimal Algorithm Configuration from an Infinite Pool. NeurIPS 2020 - [c150]Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang:
Variational Policy Gradient Method for Reinforcement Learning with General Utilities. NeurIPS 2020 - [i94]Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvári:
Provably Efficient Adaptive Approximate Policy Iteration. CoRR abs/2002.03069 (2020) - [i93]Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvári, Manzil Zaheer:
Differentiable Bandit Exploration. CoRR abs/2002.06772 (2020) - [i92]Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvári:
Model Selection in Contextual Stochastic Bandit Problems. CoRR abs/2003.01704 (2020) - [i91]Jincheng Mei, Chenjun Xiao, Csaba Szepesvári, Dale Schuurmans:
On the Global Convergence Rates of Softmax Policy Gradient Methods. CoRR abs/2005.06392 (2020) - [i90]Alex Ayoub, Zeyu Jia, Csaba Szepesvári, Mengdi Wang, Lin F. Yang:
Model-Based Reinforcement Learning with Value-Targeted Regression. CoRR abs/2006.01107 (2020) - [i89]Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvári, Craig Boutilier:
Differentiable Meta-Learning in Contextual Bandits. CoRR abs/2006.05094 (2020) - [i88]Ilja Kuzborskij, Claire Vernade, András György, Csaba Szepesvári:
Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting. CoRR abs/2006.10460 (2020) - [i87]Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvári, John Shawe-Taylor:
PAC-Bayes Analysis Beyond the Usual Bounds. CoRR abs/2006.13057 (2020) - [i86]Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang:
Variational Policy Gradient Method for Reinforcement Learning with General Utilities. CoRR abs/2007.02151 (2020) - [i85]Roshan Shariff, Csaba Szepesvári:
Efficient Planning in Large MDPs with Weak Linear Function Approximation. CoRR abs/2007.06184 (2020) - [i84]María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári:
Tighter risk certificates for neural networks. CoRR abs/2007.12911 (2020) - [i83]Gellért Weisz, Philip Amortila, Csaba Szepesvári:
Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions. CoRR abs/2010.01374 (2020) - [i82]Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans:
CoinDICE: Off-Policy Confidence Interval Estimation. CoRR abs/2010.11652 (2020) - [i81]Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Online Algorithm for Unsupervised Sequential Selection with Contextual Information. CoRR abs/2010.12353 (2020) - [i80]Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvári:
On Optimality of Meta-Learning in Fixed-Design Regression with Weighted Biased Regularization. CoRR abs/2011.00344 (2020) - [i79]Botao Hao, Tor Lattimore, Csaba Szepesvári, Mengdi Wang:
Online Sparse Reinforcement Learning. CoRR abs/2011.04018 (2020) - [i78]Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang:
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient. CoRR abs/2011.04019 (2020) - [i77]Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári:
Asymptotically Optimal Information-Directed Sampling. CoRR abs/2011.05944 (2020) - [i76]Dongruo Zhou, Quanquan Gu, Csaba Szepesvári:
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes. CoRR abs/2012.08507 (2020)
2010 – 2019
- 2019
- [c149]Karim T. Abou-Moustafa, Csaba Szepesvári:
An Exponential Tail Bound for the Deleted Estimate. AAAI 2019: 3143-3150 - [c148]Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári:
Model-Free Linear Quadratic Control via Reduction to Expert Prediction. AISTATS 2019: 3108-3117 - [c147]Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Online Algorithm for Unsupervised Sensor Selection. AISTATS 2019: 3168-3176 - [c146]Karim T. Abou-Moustafa, Csaba Szepesvári:
An Exponential Efron-Stein Inequality for Lq Stable Learning Rules. ALT 2019: 31-63 - [c145]Tor Lattimore, Csaba Szepesvári:
Cleaning up the neighborhood: A full classification for adversarial partial monitoring. ALT 2019: 529-556 - [c144]Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári:
Distribution-Dependent Analysis of Gibbs-ERM Principle. COLT 2019: 2028-2054 - [c143]Tor Lattimore, Csaba Szepesvári:
An Information-Theoretic Approach to Minimax Regret in Partial Monitoring. COLT 2019: 2111-2139 - [c142]Jonathan Uesato, Ananya Kumar, Csaba Szepesvári, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy (Dj) Dvijotham, Nicolas Heess, Pushmeet Kohli:
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures. ICLR (Poster) 2019 - [c141]Branislav Kveton, Csaba Szepesvári, Sharan Vaswani, Zheng Wen, Tor Lattimore, Mohammad Ghavamzadeh:
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits. ICML 2019: 3601-3610 - [c140]Yasin Abbasi-Yadkori, Peter L. Bartlett, Kush Bhatia, Nevena Lazic, Csaba Szepesvári, Gellért Weisz:
POLITEX: Regret Bounds for Policy Iteration using Expert Prediction. ICML 2019: 3692-3702 - [c139]Shuai Li, Tor Lattimore, Csaba Szepesvári:
Online Learning to Rank with Features. ICML 2019: 3856-3865 - [c138]Gellért Weisz, András György, Csaba Szepesvári:
CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration. ICML 2019: 6707-6715 - [c137]Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier:
Perturbed-History Exploration in Stochastic Multi-Armed Bandits. IJCAI 2019: 2786-2793 - [c136]Roman Werpachowski, András György, Csaba Szepesvári:
Detecting Overfitting via Adversarial Examples. NeurIPS 2019: 7856-7866 - [c135]Pooria Joulani, András György, Csaba Szepesvári:
Think out of the "Box": Generically-Constrained Asynchronous Composite Optimization and Hedging. NeurIPS 2019: 12225-12235 - [c134]Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvári, Masrour Zoghi:
BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback. UAI 2019: 196-206 - [c133]Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier:
Perturbed-History Exploration in Stochastic Linear Bandits. UAI 2019: 530-540 - [i75]Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Online Algorithm for Unsupervised Sensor Selection. CoRR abs/1901.04676 (2019) - [i74]Tor Lattimore, Csaba Szepesvári:
An Information-Theoretic Approach to Minimax Regret in Partial Monitoring. CoRR abs/1902.00470 (2019) - [i73]Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári:
Distribution-Dependent Analysis of Gibbs-ERM Principle. CoRR abs/1902.01846 (2019) - [i72]Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier:
Perturbed-History Exploration in Stochastic Multi-Armed Bandits. CoRR abs/1902.10089 (2019) - [i71]Roman Werpachowski, András György, Csaba Szepesvári:
Detecting Overfitting via Adversarial Examples. CoRR abs/1903.02380 (2019) - [i70]Karim T. Abou-Moustafa, Csaba Szepesvári:
An Exponential Efron-Stein Inequality for Lq Stable Learning Rules. CoRR abs/1903.05457 (2019) - [i69]Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier:
Perturbed-History Exploration in Stochastic Linear Bandits. CoRR abs/1903.09132 (2019) - [i68]Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvári:
Empirical Bayes Regret Minimization. CoRR abs/1904.02664 (2019) - [i67]Yao Ma, Alex Olshevsky, Venkatesh Saligrama, Csaba Szepesvári:
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. CoRR abs/1904.11608 (2019) - [i66]Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier:
Randomized Exploration in Generalized Linear Bandits. CoRR abs/1906.08947 (2019) - [i65]Tor Lattimore, Csaba Szepesvári:
Exploration by Optimisation in Partial Monitoring. CoRR abs/1907.05772 (2019) - [i64]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019) - [i63]Omar Rivasplata, Vikram M. Tankasali, Csaba Szepesvári:
PAC-Bayes with Backprop. CoRR abs/1908.07380 (2019) - [i62]Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári, Gellért Weisz:
Exploration-Enhanced POLITEX. CoRR abs/1908.10479 (2019) - [i61]Ilja Kuzborskij, Csaba Szepesvári:
Efron-Stein PAC-Bayesian Inequalities. CoRR abs/1909.01931 (2019) - [i60]Botao Hao, Tor Lattimore, Csaba Szepesvári:
Adaptive Exploration in Linear Contextual Bandit. CoRR abs/1910.06996 (2019) - [i59]Pratik Gajane, Ronald Ortner, Peter Auer, Csaba Szepesvári:
Autonomous exploration for navigating in non-stationary CMPs. CoRR abs/1910.08446 (2019) - [i58]Tor Lattimore, Csaba Szepesvári:
Learning with Good Feature Representations in Bandits and in RL with a Generative Model. CoRR abs/1911.07676 (2019)
- 2018
- [j40]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvári:
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. IEEE Trans. Autom. Control. 63(4): 1185-1191 (2018) - [j39]Cheng Jie, Prashanth L. A., Michael C. Fu, Steven I. Marcus, Csaba Szepesvári:
Stochastic Optimization in a Cumulative Prospect Theory Framework. IEEE Trans. Autom. Control. 63(9): 2867-2882 (2018) - [c132]Chandrashekar Lakshminarayanan, Csaba Szepesvári:
Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go? AISTATS 2018: 1347-1355 - [c131]Yao Ma, Alexander Olshevsky, Csaba Szepesvári, Venkatesh Saligrama:
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers. ICML 2018: 3341-3350 - [c130]Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvári, Steffen Grünewälder:
Bandits with Delayed, Aggregated Anonymous Feedback. ICML 2018: 4102-4110 - [c129]Gellért Weisz, András György, Csaba Szepesvári:
LEAPSANDBOUNDS: A Method for Approximately Optimal Algorithm Configuration. ICML 2018: 5254-5262 - [c128]Karim T. Abou-Moustafa, Csaba Szepesvári:
An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation. ISAIM 2018 - [c127]Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvári:
TopRank: A practical algorithm for online stochastic ranking. NeurIPS 2018: 3949-3958 - [c126]Omar Rivasplata, Csaba Szepesvári, John Shawe-Taylor, Emilio Parrado-Hernández, Shiliang Sun:
PAC-Bayes bounds for stable algorithms with instance-dependent priors. NeurIPS 2018: 9234-9244 - [i57]Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvári:
Regret Bounds for Model-Free Linear Quadratic Control. CoRR abs/1804.06021 (2018) - [i56]Tor Lattimore, Csaba Szepesvári:
Cleaning up the neighborhood: A full classification for adversarial partial monitoring. CoRR abs/1805.09247 (2018) - [i55]Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvári:
TopRank: A practical algorithm for online stochastic ranking. CoRR abs/1806.02248 (2018) - [i54]Branislav Kveton, Chang Li, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvári, Masrour Zoghi:
BubbleRank: Safe Online Learning to Rerank. CoRR abs/1806.05819 (2018) - [i53]Omar Rivasplata, Emilio Parrado-Hernández, John Shawe-Taylor, Shiliang Sun, Csaba Szepesvári:
PAC-Bayes bounds for stable algorithms with instance-dependent priors. CoRR abs/1806.06827 (2018) - [i52]Gellért Weisz, András György, Csaba Szepesvári:
LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration. CoRR abs/1807.00755 (2018) - [i51]Shuai Li, Tor Lattimore, Csaba Szepesvári:
Online Learning to Rank with Features. CoRR abs/1810.02567 (2018) - [i50]Branislav Kveton, Csaba Szepesvári, Zheng Wen, Mohammad Ghavamzadeh, Tor Lattimore:
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits. CoRR abs/1811.05154 (2018) - [i49]Jonathan Uesato, Ananya Kumar, Csaba Szepesvári, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli:
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures. CoRR abs/1812.01647 (2018) - 2017
- [j38]Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári:
Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities. J. Mach. Learn. Res. 18: 145:1-145:31 (2017) - [c125]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen:
Stochastic Rank-1 Bandits. AISTATS 2017: 392-401 - [c124]Tor Lattimore, Csaba Szepesvári:
The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits. AISTATS 2017: 728-737 - [c123]Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Unsupervised Sequential Sensor Acquisition. AISTATS 2017: 803-811 - [c122]Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller:
Structured Best Arm Identification with Fixed Confidence. ALT 2017: 593-616 - [c121]Pooria Joulani, András György, Csaba Szepesvári:
A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds. ALT 2017: 681-720 - [c120]Masrour Zoghi, Tomás Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvári, Zheng Wen:
Online Learning to Rank in Stochastic Click Models. ICML 2017: 4199-4208 - [c119]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen:
Bernoulli Rank-1 Bandits for Click Feedback. IJCAI 2017: 2001-2007 - [c118]Mahdi Karami, Martha White, Dale Schuurmans, Csaba Szepesvári:
Multi-view Matrix Factorization for Linear Dynamical System Estimation. NIPS 2017: 7092-7101 - [i48]Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári:
Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities. CoRR abs/1702.03040 (2017) - [i47]Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvári, Tomás Tunys, Zheng Wen, Masrour Zoghi:
Online Learning to Rank in Stochastic Click Models. CoRR abs/1703.02527 (2017) - [i46]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen:
Bernoulli Rank-1 Bandits for Click Feedback. CoRR abs/1703.06513 (2017) - [i45]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvári:
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. CoRR abs/1704.02544 (2017) - [i44]Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller:
Structured Best Arm Identification with Fixed Confidence. CoRR abs/1706.05198 (2017) - [i43]Karim T. Abou-Moustafa, Csaba Szepesvári:
An a Priori Exponential Tail Bound for k-Folds Cross-Validation. CoRR abs/1706.05801 (2017) - [i42]Yao Ma, Alex Olshevsky, Venkatesh Saligrama, Csaba Szepesvári:
Crowdsourcing with Sparsely Interacting Workers. CoRR abs/1706.06660 (2017) - [i41]Daniel J. Hsu, Aryeh Kontorovich, David A. Levin, Yuval Peres, Csaba Szepesvári:
Mixing time estimation in reversible Markov chains from a single sample path. CoRR abs/1708.07367 (2017) - [i40]Pooria Joulani, András György, Csaba Szepesvári:
A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds. CoRR abs/1709.02726 (2017) - [i39]Chandrashekar Lakshminarayanan, Csaba Szepesvári:
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging. CoRR abs/1709.04073 (2017) - [i38]Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvári, Steffen Grünewälder:
Bandits with Delayed Anonymous Feedback. CoRR abs/1709.06853 (2017) - [i37]Branislav Kveton, Csaba Szepesvári, Anup Rao, Zheng Wen, Yasin Abbasi-Yadkori, S. Muthukrishnan:
Stochastic Low-Rank Bandits. CoRR abs/1712.04644 (2017) - 2016
- [j37]Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor:
Regularized Policy Iteration with Nonparametric Function Spaces. J. Mach. Learn. Res. 17: 139:1-139:66 (2016) - [c117]Pooria Joulani, András György, Csaba Szepesvári:
Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms. AAAI 2016: 1744-1750 - [c116]Guy Lever, John Shawe-Taylor, Ronnie Stafford, Csaba Szepesvári:
Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning. AAAI 2016: 1779-1787 - [c115]Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári:
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles. AISTATS 2016: 819-828 - [c114]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Zheng Wen:
DCM Bandits: Learning to Rank with Multiple Clicks. ICML 2016: 1215-1224 - [c113]Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvári:
Conservative Bandits. ICML 2016: 1254-1262 - [c112]Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus, Csaba Szepesvári:
Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control. ICML 2016: 1406-1415 - [c111]András György, Csaba Szepesvári:
Shifting Regret, Mirror Descent, and Matrices. ICML 2016: 2943-2951 - [c110]Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári:
Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities. NIPS 2016: 4970-4978 - [c109]Kiarash Shaloudegi, András György, Csaba Szepesvári, Wilsun Xu:
SDP Relaxation with Randomized Rounding for Energy Disaggregation. NIPS 2016: 4979-4987 - [i36]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Zheng Wen:
DCM Bandits: Learning to Rank with Multiple Clicks. CoRR abs/1602.03146 (2016) - [i35]Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvári:
Conservative Bandits. CoRR abs/1602.04282 (2016) - [i34]Bernardo Ávila Pires, Csaba Szepesvári:
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models. CoRR abs/1602.06346 (2016) - [i33]Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen:
Stochastic Rank-1 Bandits. CoRR abs/1608.03023 (2016) - [i32]Gábor Balázs, András György, Csaba Szepesvári:
Chaining Bounds for Empirical Risk Minimization. CoRR abs/1609.01872 (2016) - [i31]Gábor Balázs, András György, Csaba Szepesvári:
Max-affine estimators for convex stochastic programming. CoRR abs/1609.06331 (2016) - [i30]Bernardo Ávila Pires, Csaba Szepesvári:
Multiclass Classification Calibration Functions. CoRR abs/1609.06385 (2016) - [i29]Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári:
(Bandit) Convex Optimization with Biased Noisy Gradient Oracles. CoRR abs/1609.07087 (2016) - [i28]Tor Lattimore, Csaba Szepesvári:
The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits. CoRR abs/1610.04491 (2016) - [i27]Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama:
Sequential Learning without Feedback. CoRR abs/1610.05394 (2016) - [i26]Kiarash Shaloudegi, András György, Csaba Szepesvári, Wilsun Xu:
SDP Relaxation with Randomized Rounding for Energy Disaggregation. CoRR abs/1610.09491 (2016) - 2015
- [c108]Nolan Bard, Deon Nicholas, Csaba Szepesvári, Michael Bowling:
Decision-Theoretic Clustering of Strategies. AAAI Workshop: Computer Poker and Imperfect Information 2015 - [c107]Bernardo Ávila Pires, Csaba Szepesvári:
Pathological Effects of Variance on Classification-Based Policy Iteration. AAAI Workshop: Learning for General Competency in Video Games 2015 - [c106]Gábor Balázs, András György, Csaba Szepesvári:
Near-optimal max-affine estimators for convex regression. AISTATS 2015 - [c105]Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári:
Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits. AISTATS 2015 - [c104]Lihong Li, Rémi Munos, Csaba Szepesvári:
Toward Minimax Off-policy Value Estimation. AISTATS 2015 - [c103]Roshan Shariff, András György, Csaba Szepesvári:
Exploiting Symmetries to Construct Efficient MCMC Algorithms With an Application to SLAM. AISTATS 2015 - [c102]Nolan Bard, Deon Nicholas, Csaba Szepesvári, Michael H. Bowling:
Decision-theoretic Clustering of Strategies. AAMAS 2015: 17-25 - [c101]Branislav Kveton, Csaba Szepesvári, Zheng Wen, Azin Ashkan:
Cascading Bandits: Learning to Rank in the Cascade Model. ICML 2015: 767-776 - [c100]Yifan Wu, András György, Csaba Szepesvári:
On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments. ICML 2015: 1283-1291 - [c99]Ruitong Huang, András György, Csaba Szepesvári:
Deterministic Independent Component Analysis. ICML 2015: 2521-2530 - [c98]Pooria Joulani, András György, Csaba Szepesvári:
Fast Cross-Validation for Incremental Learning. IJCAI 2015: 3597-3604 - [c97]Tor Lattimore, Koby Crammer, Csaba Szepesvári:
Linear Multi-Resource Allocation with Semi-Bandit Feedback. NIPS 2015: 964-972 - [c96]Yifan Wu, András György, Csaba Szepesvári:
Online Learning with Gaussian Payoffs and Side Observations. NIPS 2015: 1360-1368 - [c95]Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári:
Combinatorial Cascading Bandits. NIPS 2015: 1450-1458 - [c94]Daniel J. Hsu, Aryeh Kontorovich, Csaba Szepesvári:
Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path. NIPS 2015: 1459-1467 - [c93]Yasin Abbasi-Yadkori, Csaba Szepesvári:
Bayesian Optimal Control of Smoothly Parameterized Systems. UAI 2015: 1-11 - [i25]Branislav Kveton, Csaba Szepesvári, Zheng Wen, Azin Ashkan:
Cascading Bandits. CoRR abs/1502.02763 (2015) - [i24]Daniel J. Hsu, Aryeh Kontorovich, Csaba Szepesvári:
Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path. CoRR abs/1506.02903 (2015) - [i23]Pooria Joulani, András György, Csaba Szepesvári:
Fast Cross-Validation for Incremental Learning. CoRR abs/1507.00066 (2015) - [i22]Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári:
Combinatorial Cascading Bandits. CoRR abs/1507.04208 (2015) - [i21]Yifan Wu, András György, Csaba Szepesvári:
Online Learning with Gaussian Payoffs and Side Observations. CoRR abs/1510.08108 (2015) - [i20]Ruitong Huang, Bing Xu, Dale Schuurmans, Csaba Szepesvári:
Learning with a Strong Adversary. CoRR abs/1511.03034 (2015) - 2014
- [j36]Gábor Bartók, Dean P. Foster, Dávid Pál, Alexander Rakhlin, Csaba Szepesvári:
Partial Monitoring - Classification, Regret Bounds, and Algorithms. Math. Oper. Res. 39(4): 967-997 (2014) - [j35]Gergely Neu, András György, Csaba Szepesvári, András Antos:
Online Markov Decision Processes Under Bandit Feedback. IEEE Trans. Autom. Control. 59(3): 676-691 (2014) - [j34]Jyrki Kivinen, Csaba Szepesvári, Thomas Zeugmann:
Guest Editors' introduction. Theor. Comput. Sci. 519: 1-3 (2014) - [j33]Thanh Le, Csaba Szepesvári, Rong Zheng:
Sequential Learning for Multi-Channel Wireless Network Monitoring With Channel Switching Costs. IEEE Trans. Signal Process. 62(22): 5919-5929 (2014) - [c92]Hengshuai Yao, Csaba Szepesvári, Bernardo Ávila Pires, Xinhua Zhang:
Pseudo-MDPs and factored linear action models. ADPRL 2014: 1-9 - [c91]Ruitong Huang, Csaba Szepesvári:
A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models. AISTATS 2014: 402-410 - [c90]Tor Lattimore, András György, Csaba Szepesvári:
On Learning the Optimal Waiting Time. ALT 2014: 200-214 - [c89]Travis Dick, András György, Csaba Szepesvári:
Online Learning in Markov Decision Processes with Changing Cost Sequences. ICML 2014: 512-520 - [c88]James Neufeld, András György, Csaba Szepesvári, Dale Schuurmans:
Adaptive Monte Carlo via Bandit Allocation. ICML 2014: 1944-1952 - [c87]Ruitong Huang, Csaba Szepesvári:
Generalization Bounds for Partially Linear Models. ISAIM 2014 - [c86]Hengshuai Yao, Csaba Szepesvári, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar:
Universal Option Models. NIPS 2014: 990-998 - [c85]Tor Lattimore, Koby Crammer, Csaba Szepesvári:
Optimal Resource Allocation with Semi-Bandit Feedback. UAI 2014: 477-486 - [e3]Maria-Florina Balcan, Vitaly Feldman, Csaba Szepesvári:
Proceedings of The 27th Conference on Learning Theory, COLT 2014, Barcelona, Spain, June 13-15, 2014. JMLR Workshop and Conference Proceedings 35, JMLR.org 2014 - [i19]James Neufeld, András György, Dale Schuurmans, Csaba Szepesvári:
Adaptive Monte Carlo via Bandit Allocation. CoRR abs/1405.3318 (2014) - [i18]Tor Lattimore, Koby Crammer, Csaba Szepesvári:
Optimal Resource Allocation with Semi-Bandit Feedback. CoRR abs/1406.3840 (2014) - [i17]Yasin Abbasi-Yadkori, Csaba Szepesvári:
Bayesian Optimal Control of Smoothly Parameterized Systems: The Lazy Posterior Sampling Algorithm. CoRR abs/1406.3926 (2014) - [i16]Lihong Li, Rémi Munos, Csaba Szepesvári:
On Minimax Optimal Offline Policy Evaluation. CoRR abs/1409.3653 (2014) - [i15]Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári:
Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits. CoRR abs/1410.0949 (2014) - 2013
- [j32]Arash Afkanpour, Csaba Szepesvári, Michael Bowling:
Alignment based kernel learning with a continuous set of base kernels. Mach. Learn. 91(3): 305-324 (2013) - [j31]András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári:
Toward a classification of finite partial-monitoring games. Theor. Comput. Sci. 473: 77-99 (2013) - [c84]Arash Afkanpour, András György, Csaba Szepesvári, Michael Bowling:
A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning. ICML (1) 2013: 374-382 - [c83]Yaoliang Yu, Hao Cheng, Dale Schuurmans, Csaba Szepesvári:
Characterizing the Representer Theorem. ICML (1) 2013: 570-578 - [c82]Bernardo Ávila Pires, Csaba Szepesvári, Mohammad Ghavamzadeh:
Cost-sensitive Multiclass Classification Risk Bounds. ICML (3) 2013: 1391-1399 - [c81]Pooria Joulani, András György, Csaba Szepesvári:
Online Learning under Delayed Feedback. ICML (3) 2013: 1453-1461 - [c80]Navid Zolghadr, Gábor Bartók, Russell Greiner, András György, Csaba Szepesvári:
Online Learning with Costly Features and Labels. NIPS 2013: 1241-1249 - [c79]Yasin Abbasi-Yadkori, Peter L. Bartlett, Varun Kanade, Yevgeny Seldin, Csaba Szepesvári:
Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions. NIPS 2013: 2508-2516 - [i14]Yasin Abbasi-Yadkori, Peter L. Bartlett, Csaba Szepesvári:
Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions. CoRR abs/1303.3055 (2013) - [i13]Pooria Joulani, András György, Csaba Szepesvári:
Online Learning under Delayed Feedback. CoRR abs/1306.0686 (2013) - 2012
- [j30]Sylvain Gelly, Levente Kocsis, Marc Schoenauer, Michèle Sebag, David Silver, Csaba Szepesvári, Olivier Teytaud:
The grand challenge of computer Go: Monte Carlo tree search and extensions. Commun. ACM 55(3): 106-113 (2012) - [c78]Hengshuai Yao, Csaba Szepesvári:
Approximate Policy Iteration with Linear Action Models. AAAI 2012: 1212-1218 - [c77]Gábor Bartók, Csaba Szepesvári:
Partial Monitoring with Side Information. ALT 2012: 305-319 - [c76]Marc Peter Deisenroth, Csaba Szepesvári, Jan Peters:
Preface. EWRL 2012 - [c75]Yevgeny Seldin, Csaba Szepesvári, Peter Auer, Yasin Abbasi-Yadkori:
Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments. EWRL 2012: 103-116 - [c74]Gábor Bartók, Navid Zolghadr, Csaba Szepesvári:
An adaptive algorithm for finite stochastic partial monitoring. ICML 2012 - [c73]Bernardo Ávila Pires, Csaba Szepesvári:
Statistical linear estimation with penalized estimators: an application to reinforcement learning. ICML 2012 - [c72]Yaoliang Yu, Csaba Szepesvári:
Analysis of Kernel Mean Matching under Covariate Shift. ICML 2012 - [c71]Ryan Kiros, Csaba Szepesvári:
Deep Representations and Codes for Image Auto-Annotation. NIPS 2012: 917-925 - [c70]Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári:
Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits. AISTATS 2012: 1-9 - [c69]Gergely Neu, András György, Csaba Szepesvári:
The adversarial stochastic shortest path problem with unknown transition probabilities. AISTATS 2012: 805-813 - [e2]Marc Peter Deisenroth, Csaba Szepesvári, Jan Peters:
Proceedings of the Tenth European Workshop on Reinforcement Learning, EWRL 2012, Edinburgh, Scotland, UK, June, 2012. JMLR Proceedings 24, JMLR.org 2012 - [i12]Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári:
PAC-Bayesian Policy Evaluation for Reinforcement Learning. CoRR abs/1202.3717 (2012) - [i11]Arash Afkanpour, András György, Csaba Szepesvári, Michael H. Bowling:
A Randomized Strategy for Learning to Combine Many Features. CoRR abs/1205.0288 (2012) - [i10]Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner:
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstractions. CoRR abs/1206.3233 (2012) - [i9]Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael Bowling:
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. CoRR abs/1206.3285 (2012) - [i8]Yaoliang Yu, Csaba Szepesvári:
Analysis of Kernel Mean Matching under Covariate Shift. CoRR abs/1206.4650 (2012) - [i7]Gergely Neu, Csaba Szepesvári:
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods. CoRR abs/1206.5264 (2012) - 2011
- [j29]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
X-Armed Bandits. J. Mach. Learn. Res. 12: 1655-1695 (2011) - [j28]Amir Massoud Farahmand, Csaba Szepesvári:
Model selection in reinforcement learning. Mach. Learn. 85(3): 299-332 (2011) - [c68]Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann:
Editors' Introduction. ALT 2011: 1-13 - [c67]Csaba Szepesvári:
Invited Talk: Towards Robust Reinforcement Learning Algorithms. EWRL 2011: 4 - [c66]Pallavi Arora, Csaba Szepesvári, Rong Zheng:
Sequential learning for optimal monitoring of multi-channel wireless networks. INFOCOM 2011: 1152-1160 - [c65]Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári:
Improved Algorithms for Linear Stochastic Bandits. NIPS 2011: 2312-2320 - [c64]Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári:
PAC-Bayesian Policy Evaluation for Reinforcement Learning. UAI 2011: 195-202 - [c63]Yasin Abbasi-Yadkori, Csaba Szepesvári:
Regret Bounds for the Adaptive Control of Linear Quadratic Systems. COLT 2011: 1-26 - [c62]Gábor Bartók, Dávid Pál, Csaba Szepesvári:
Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments. COLT 2011: 133-154 - [c61]István Szita, Csaba Szepesvári:
Agnostic KWIK learning and efficient approximate reinforcement learning. COLT 2011: 739-772 - [e1]Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann:
Algorithmic Learning Theory - 22nd International Conference, ALT 2011, Espoo, Finland, October 5-7, 2011. Proceedings. Lecture Notes in Computer Science 6925, Springer 2011, ISBN 978-3-642-24411-7 - [i6]András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári:
Toward a Classification of Finite Partial-Monitoring Games. CoRR abs/1102.2041 (2011) - [i5]Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári:
Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems. CoRR abs/1102.2670 (2011) - [i4]András Antos, Gábor Bartók, Csaba Szepesvári:
Non-trivial two-armed partial-monitoring games are bandits. CoRR abs/1108.4961 (2011) - [i3]Arash Afkanpour, Csaba Szepesvári, Michael H. Bowling:
Alignment Based Kernel Learning with a Continuous Set of Base Kernels. CoRR abs/1112.4607 (2011) - 2010
- [b1]Csaba Szepesvári:
Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers 2010, ISBN 978-3-031-00423-0 - [j27]Gábor Bartók, Csaba Szepesvári, Sandra Zilles:
Models of active learning in group-structured state spaces. Inf. Comput. 208(4): 364-384 (2010) - [j26]András Antos, Varun Grover, Csaba Szepesvári:
Active learning in heteroscedastic noise. Theor. Comput. Sci. 411(29-30): 2712-2728 (2010) - [c60]Gábor Bartók, Dávid Pál, Csaba Szepesvári:
Toward a Classification of Finite Partial-Monitoring Games. ALT 2010: 224-238 - [c59]Gergely Neu, András György, Csaba Szepesvári:
The Online Loop-free Stochastic Shortest-Path Problem. COLT 2010: 231-243 - [c58]Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton:
Toward Off-Policy Learning Control with Function Approximation. ICML 2010: 719-726 - [c57]Liuyang Li, Barnabás Póczos, Csaba Szepesvári, Russell Greiner:
Budgeted Distribution Learning of Belief Net Parameters. ICML 2010: 879-886 - [c56]Istvan Szita, Csaba Szepesvári:
Model-based reinforcement learning with nearly tight exploration complexity bounds. ICML 2010: 1031-1038 - [c55]Yasin Abbasi-Yadkori, Joseph Modayil, Csaba Szepesvári:
Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning. IROS 2010: 127-132 - [c54]Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári:
Error Propagation for Approximate Policy and Value Iteration. NIPS 2010: 568-576 - [c53]Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári:
Parametric Bandits: The Generalized Linear Case. NIPS 2010: 586-594 - [c52]Gergely Neu, András György, Csaba Szepesvári, András Antos:
Online Markov Decision Processes under Bandit Feedback. NIPS 2010: 1804-1812 - [c51]Dávid Pál, Barnabás Póczos, Csaba Szepesvári:
Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs. NIPS 2010: 1849-1857 - [c50]Barnabás Póczos, Sergey Kirshner, Csaba Szepesvári:
REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization. AISTATS 2010: 605-612 - [c49]Péter Torma, András György, Csaba Szepesvári:
A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping. AISTATS 2010: 852-859 - [i2]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
X-Armed Bandits. CoRR abs/1001.4475 (2010) - [i1]Dávid Pál, Barnabás Póczos, Csaba Szepesvári:
Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs. CoRR abs/1003.1954 (2010)
2000 – 2009
- 2009
- [j25]Gergely Neu, Csaba Szepesvári:
Training parsers by inverse reinforcement learning. Mach. Learn. 77(2-3): 303-337 (2009) - [j24]Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári:
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009) - [c48]Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor:
Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems. ACC 2009: 725-730 - [c47]Hengshuai Yao, Shalabh Bhatnagar, Csaba Szepesvári:
LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS. CDC 2009: 1181-1188 - [c46]Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári:
Workshop summary: On-line learning with limited feedback. ICML 2009: 8 - [c45]Alireza Farhangfar, Russell Greiner, Csaba Szepesvári:
Learning to segment from a few well-selected training images. ICML 2009: 305-312 - [c44]Barnabás Póczos, Yasin Abbasi-Yadkori, Csaba Szepesvári, Russell Greiner, Nathan R. Sturtevant:
Learning when to stop thinking and do something! ICML 2009: 825-832 - [c43]Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora:
Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML 2009: 993-1000 - [c42]Amir Massoud Farahmand, Azad Shademan, Martin Jägersand, Csaba Szepesvári:
Model-based and model-free reinforcement learning for visual servoing. ICRA 2009: 2917-2924 - [c41]Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton:
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009: 1204-1212 - [c40]Hengshuai Yao, Richard S. Sutton, Shalabh Bhatnagar, Diao Dongcui, Csaba Szepesvári:
Multi-Step Dyna Planning for Policy Evaluation and Control. NIPS 2009: 2187-2195 - [c39]Yaoliang Yu, Yuxi Li, Dale Schuurmans, Csaba Szepesvári:
A General Projection Property for Distribution Families. NIPS 2009: 2232-2240 - [c38]Yuxi Li, Csaba Szepesvári, Dale Schuurmans:
Learning Exercise Policies for American Options. AISTATS 2009: 352-359 - 2008
- [j23]Rémi Munos, Csaba Szepesvári:
Finite-Time Bounds for Fitted Value Iteration. J. Mach. Learn. Res. 9: 815-857 (2008) - [j22]András Antos, Csaba Szepesvári, Rémi Munos:
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Mach. Learn. 71(1): 89-129 (2008) - [c37]András Antos, Varun Grover, Csaba Szepesvári:
Active Learning in Multi-armed Bandits. ALT 2008: 287-302 - [c36]Gábor Bartók, Csaba Szepesvári, Sandra Zilles:
Active Learning of Group-Structured Environments. ALT 2008: 329-343 - [c35]Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor:
Regularized Fitted Q-Iteration: Application to Planning. EWRL 2008: 55-68 - [c34]Volodymyr Mnih, Csaba Szepesvári, Jean-Yves Audibert:
Empirical Bernstein stopping. ICML 2008: 672-679 - [c33]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári:
Online Optimization in X-Armed Bandits. NIPS 2008: 201-208 - [c32]Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor:
Regularized Policy Iteration. NIPS 2008: 441-448 - [c31]Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei:
A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616 - [c30]Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner:
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction. UAI 2008: 306-314 - [c29]Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling:
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536 - 2007
- [c28]Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári:
Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165 - [c27]Peter Auer, Ronald Ortner, Csaba Szepesvári:
Improved Rates for the Stochastic Continuum-Armed Bandit Problem. COLT 2007: 454-468 - [c26]Amir Massoud Farahmand, Csaba Szepesvári, Jean-Yves Audibert:
Manifold-adaptive dimension estimation. ICML 2007: 265-272 - [c25]András György, Levente Kocsis, Ivett Szabó, Csaba Szepesvári:
Continuous Time Associative Bandit Problems. IJCAI 2007: 830-835 - [c24]István Bíró, Zoltán Szamonek, Csaba Szepesvári:
Sequence Prediction Exploiting Similarity Information. IJCAI 2007: 1576-1581 - [c23]András Antos, Rémi Munos, Csaba Szepesvári:
Fitted Q-iteration in continuous action-space MDPs. NIPS 2007: 9-16 - [c22]Gergely Neu, Csaba Szepesvári:
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods. UAI 2007: 295-302 - 2006
- [j21]Péter Torma, Csaba Szepesvári:
Local Importance Sampling: A Novel Technique to Enhance Particle Filtering. J. Multim. 1(1): 32-43 (2006) - [j20]Levente Kocsis, Csaba Szepesvári:
Universal parameter optimisation in games based on SPSA. Mach. Learn. 63(3): 249-286 (2006) - [c21]Levente Kocsis, Csaba Szepesvári, Mark H. M. Winands:
RSPSA: Enhanced Parameter Optimization in Games. ACG 2006: 39-56 - [c20]András Antos, Csaba Szepesvári, Rémi Munos:
Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588 - [c19]Levente Kocsis, Csaba Szepesvári:
Bandit Based Monte-Carlo Planning. ECML 2006: 282-293 - 2005
- [c18]László Gerencsér, Miklós Rásonyi, Csaba Szepesvári, Zsuzsanna Vágó:
Log-optimal currency portfolios and control Lyapunov exponents. CDC/ECC 2005: 1764-1769 - [c17]Zoltán Szamonek, Csaba Szepesvári:
X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs When the Number of Mixtures Is Unknown. ICDM 2005: 434-441 - [c16]Csaba Szepesvári, Rémi Munos:
Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887 - 2004
- [c15]Csaba Szepesvári:
Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results. AAAI 2004: 550-555 - [c14]Csaba Szepesvári, András Kocsor, Kornél Kovács:
Kernel Machine Based Feature Extraction Algorithms for Regression Problems. ECAI 2004: 1091-1092 - [c13]Péter Torma, Csaba Szepesvári:
Enhancing Particle Filters Using Local Likelihood Sampling. ECCV (1) 2004: 16-27 - [c12]András Kocsor, Kornél Kovács, Csaba Szepesvári:
Margin Maximizing Discriminant Analysis. ECML 2004: 227-238 - [c11]Csaba Szepesvári, William D. Smart:
Interpolation-based Q-learning. ICML 2004 - 2003
- [c10]Péter Torma, Csaba Szepesvári:
Sequential Importance Sampling for Visual Tracking Reconsidered. AISTATS 2003: 284-291 - 2002
- [j19]Mark French, Csaba Szepesvári, Eric Rogers:
LQ performance bounds for adaptive output feedback controllers for functionally uncertain nonlinear systems. Autom. 38(4): 683-693 (2002) - [j18]Mark French, Csaba Szepesvári, Eric Rogers:
An Asymptotic Scaling Analysis of LQ Performance for an Approximate Adaptive Control Design. Math. Control. Signals Syst. 15(2): 145-176 (2002) - 2001
- [j17]Csaba Szepesvári:
Efficient approximate planning in continuous space Markovian Decision Problems. AI Commun. 14(3): 163-176 (2001) - [j16]András Lörincz, György Hévízi, Csaba Szepesvári:
Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops. Int. J. Neural Syst. 11(2): 125-143 (2001) - 2000
- [j15]Zsolt Kalmár, Csaba Szepesvári, András Lörincz:
Modular Reinforcement Learning: A Case Study in a Robot Domain. Acta Cybern. 14(3): 507-522 (2000) - [j14]Satinder Singh, Tommi S. Jaakkola, Michael L. Littman, Csaba Szepesvári:
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. Mach. Learn. 38(3): 287-308 (2000) - [j13]Mark French, Csaba Szepesvári, Eric Rogers:
Uncertainty, performance, and model dependency in approximate adaptive nonlinear control. IEEE Trans. Autom. Control. 45(2): 353-358 (2000) - [c9]György Balogh, Ervin Dobler, Tamás Gröbler, Béla Smodics, Csaba Szepesvári:
FlexVoice: A Parametric Approach to High-Quality Speech Synthesis. TSD 2000: 189-194
1990 – 1999
- 1999
- [j12]János Murvai, Kristian Vlahovicek, Endre Barta, Csaba Szepesvári, Cristina Acatrinei, Sándor Pongor:
The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments. Nucleic Acids Res. 27(1): 257-259 (1999)
- [j11]Csaba Szepesvári, Michael L. Littman:
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms. Neural Comput. 11(8): 2017-2060 (1999)
- [j10]Zsolt Kalmár, Zsolt Marczell, Csaba Szepesvári, András Lörincz:
Parallel and robust skeletonization built on self-organizing elements. Neural Networks 12(1): 163-173 (1999)
- 1998
- [j9]Csaba Szepesvári:
Non-Markovian Policies in Sequential Decision Problems. Acta Cybern. 13(3): 305-318 (1998)
- [j8]Zsolt Kalmár, Csaba Szepesvári, András Lörincz:
Module-Based Reinforcement Learning: Experiments with a Real Robot. Auton. Robots 5(3-4): 273-295 (1998)
- [j7]Csaba Szepesvári, András Lörincz:
An integrated architecture for motion-control and path-planning. J. Field Robotics 15(1): 1-15 (1998)
- [j6]Zsolt Kalmár, Csaba Szepesvári, András Lörincz:
Module-Based Reinforcement Learning: Experiments with a Real Robot. Mach. Learn. 31(1-3): 55-85 (1998)
- [c8]Zoltán Gábor, Zsolt Kalmár, Csaba Szepesvári:
Multi-criteria Reinforcement Learning. ICML 1998: 197-205
- [c7]Erich Sorantin, Ferdinand Schmidt, Heinz Mayer, Peter Winkler, Csaba Szepesvári:
Automated Detection and Classification of Micro-Calcifications in Mammograms Using Artifical Neural Nets. Digital Mammography / IWDM 1998: 225-232
- [c6]Peter Winkler, Erich Sorantin, Attila Tanács, Ferdinand Schmidt, Heinz Mayer, Csaba Szepesvári:
Performance-Evaluation for Automated Detection of Microcalcifications in Mammograms Using Three Different Film-Digitizers. Digital Mammography / IWDM 1998: 485-486
- 1997
- [j5]Csaba Szepesvári, Szabolcs Cimmer, András Lörincz:
Neurocontroller using dynamic state feedback for compensatory control. Neural Networks 10(9): 1691-1708 (1997)
- [c5]Csaba Szepesvári:
Learning and Exploitation Do Not Conflict Under Minimax Optimality. ECML 1997: 242-249
- [c4]Zsolt Kalmár, Csaba Szepesvári, András Lörincz:
Module Based Reinforcement Learning: An Application to a Real Robot. EWLR 1997: 29-45
- [c3]Csaba Szepesvári:
The Asymptotic Convergence-Rate of Q-learning. NIPS 1997: 1064-1070
- 1996
- [j4]Tibor Fomin, Tamás Rozgonyi, Csaba Szepesvári, András Lörincz:
Self-Organizing Multi-Resolution Grid for Motion Planning and Control. Int. J. Neural Syst. 7(6): 757- (1996)
- [j3]Csaba Szepesvári, András Lörincz:
Approximate geometry representations and sensory fusion. Neurocomputing 12(2-3): 267-287 (1996)
- [c2]Csaba Szepesvári, András Lörincz:
Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers. ICANN 1996: 791-796
- [c1]Michael L. Littman, Csaba Szepesvári:
A Generalized Reinforcement-Learning Model: Convergence and Applications. ICML 1996: 310-318
- 1994
- [j2]Csaba Szepesvári, László Balázs, András Lörincz:
Topology Learning Solved by Extended Objects: A Neural Network Model. Neural Comput. 6(3): 441-458 (1994)
- 1993
- [j1]Csaba Szepesvári, András Lörincz:
Behavior of an Adaptive Self-organizing Autonomous Agent Working with Cues and Competing Concepts. Adapt. Behav. 2(2): 131-160 (1993)
last updated on 2025-01-09 13:04 CET by the dblp team
all metadata released as open data under CC0 1.0 license