[1] J. A. Boyan and A. W. Moore, “Generalization in reinforcement learning: Safely approximating the value function,” in Advances in Neural Information Processing Systems 7, The MIT Press, pp. 369-376, 1995.
[2] R. S. Sutton, “Generalization in reinforcement learning: Successful examples using sparse coarse coding,” in Advances in Neural Information Processing Systems, edited by David S. Touretzky, Michael C. Mozer, and Michael E. Hasselmo, The MIT Press, Vol. 8, pp. 1038-1044, 1996.
[3] W. D. Smart and L. P. Kaelbling, “Practical reinforcement learning in continuous spaces,” in ICML’00: Proceedings of the Seventeenth International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., pp. 903-910, 2000.
[4] M. G. Lagoudakis, R. Parr, and M. L. Littman, “Least-squares methods in reinforcement learning for control,” in Proceedings of Methods and Applications of Artificial Intelligence: Second Hellenic Conference on AI, SETN 2002, Thessaloniki, Greece, Springer, pp. 752-752, April 11-12, 2002.
[5] K. Doya, “Reinforcement learning in continuous time and space,” Neural Computation, Vol. 12, pp. 219-245, 2000.
[6] P. Wawrzyński and A. Pacut, “Model-free off-policy reinforcement learning in continuous environment,” in Proceedings of the INNS-IEEE International Joint Conference on Neural Networks, pp. 1091-1096, 2004.
[7] J. Morimoto and K. Doya, “Robust reinforcement learning,” Neural Computation, Vol. 17, pp. 335-359, 2005.
[8] G. Boone, “Efficient reinforcement learning: Model-based acrobot control,” in International Conference on Robotics and Automation, pp. 229-234, 1997.
[9] X. Xu, D. W. Hu, and X. C. Lu, “Kernel-based least squares policy iteration for reinforcement learning,” IEEE Transactions on Neural Networks, Vol. 18, pp. 973-992, 2007.
[10] J. C. Santamaría, R. S. Sutton, and A. Ram, “Experiments with reinforcement learning in problems with continuous state and action spaces,” Adaptive Behavior, Vol. 6, pp. 163-217, 1997.
[11] J. D. R. Millán and C. Torras, “A reinforcement connectionist approach to robot path finding in non-maze-like environments,” Machine Learning, Vol. 8, pp. 363-395, 1992.
[12] T. Fukao, T. Sumitomo, N. Ineyama, and N. Adachi, “Q-learning based on regularization theory to treat the continuous states and actions,” in Proceedings of the 1998 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Vol. 2, pp. 1057-1062, 1998.
[13] R. S. Sutton and A. G. Barto, “Reinforcement Learning: An Introduction,” Adaptive Computation and Machine Learning series, The MIT Press, March 1998.
[14] L. C. Baird and A. H. Klopf, “Reinforcement learning with high-dimensional, continuous actions,” Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base, Ohio, 1993.
[15] J. D. R. Millán, D. Posenato, and E. Dedieu, “Continuous-action Q-learning,” Machine Learning, Vol. 49, pp. 247-265, 2002.
[16] H. Arie, J. Namikawa, T. Ogata, J. Tani, and S. Sugano, “Reinforcement learning algorithm with CTRNN in continuous action space,” in Proceedings of Neural Information Processing, Part 1, Vol. 4232, pp. 387-396, 2006.
[17] R. M. Kretchmar and C. W. Anderson, “Comparison of CMACs and radial basis functions for local function approximators in reinforcement learning,” in International Conference on Neural Networks, pp. 834-837, 1997.
[18] R. Dembo and T. Steihaug, “Truncated-Newton algorithms for large-scale unconstrained optimization,” Mathematical Programming, Vol. 26, pp. 190-212, 1983.
[19] R. S. Sutton, “Learning to predict by the methods of temporal differences,” Machine Learning, Vol. 3, pp. 9-44, 1988.