Abstract
The goal of this research was to develop a hybrid real-time problem-solving architecture that couples symbolic planning methods with connectionist reinforcement learning methods. This hybrid architecture achieves reasonable performance immediately, because the symbolic planning system can quickly construct an acceptable control policy; it also approaches optimal real-time performance over time, because the reinforcement learning system eventually converges on a near-optimal policy. Many DoD problems would benefit from the ability to perform near-optimal real-time control of complex systems.

Subject terms: real-time problem solving, machine learning, reinforcement learning, planning
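To make the coupling concrete, the following is a minimal sketch, not the report's actual implementation: a tabular Q-learner stands in for the connectionist reinforcement learner, and the class name, the visit-count threshold, and the planner_policy callable are all illustrative assumptions. The controller defers to the planner's acceptable policy in unfamiliar states, so performance is reasonable from the start, and shifts to the learned value estimates as experience accumulates.

```python
import random
from collections import defaultdict

class HybridController:
    """Hypothetical sketch of the hybrid architecture: a symbolic
    planner seeds the control policy, while a Q-learner (here tabular,
    standing in for a connectionist learner) gradually takes over."""

    def __init__(self, planner_policy, actions,
                 alpha=0.1, gamma=0.95, epsilon=0.1, handoff=20):
        self.planner_policy = planner_policy  # state -> action, from the symbolic planner
        self.actions = actions
        self.q = defaultdict(float)           # (state, action) -> value estimate
        self.visits = defaultdict(int)        # updates seen per state
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.handoff = handoff                # illustrative threshold for trusting Q

    def act(self, state):
        # In lightly explored states, defer to the planner's acceptable
        # policy; once a state is well explored, act (mostly) greedily
        # on the learned values, which converge toward near-optimal.
        if self.visits[state] < self.handoff:
            return self.planner_policy(state)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning backup.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        self.visits[state] += 1
```

In use, planner_policy could be a lookup into plans produced by any symbolic planner; the handoff threshold controls how quickly control passes from the planned policy to the learned one.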
Online Information Available
Postscript files for all papers are available via WWW from the following URLs: http://www.cs.orst.edu/~tgd/ and http://www.cs.orst.edu/~tadepall/