References

  1. B. D. Argall, S. Chernova, M. Veloso & B. Browning (2009): A survey of robot learning from demonstration. Robotics and Autonomous Systems 57(5), pp. 469–483, doi:10.1016/j.robot.2008.10.024.
  2. A. Barto & S. Mahadevan (2003): Recent Advances in Hierarchical Reinforcement Learning. Discrete Event Dynamic Systems 13, pp. 41–77, doi:10.1023/A:1022140919877.
  3. S. Bhatnagar, R. Sutton, M. Ghavamzadeh & M. Lee (2009): Natural Actor-Critic Algorithms. Automatica 45(11), pp. 2471–2482, doi:10.1016/j.automatica.2009.07.008.
  4. A. Cimatti, M. Pistore & P. Traverso (2008): Automated planning. In: F. van Harmelen, V. Lifschitz & B. Porter, editors: Handbook of Knowledge Representation. Elsevier, doi:10.1016/S1574-6526(07)03022-2.
  5. S. T. Erdoğan (2008): A Library of General-Purpose Action Descriptions. PhD thesis, University of Texas at Austin.
  6. S. Griffith, K. Subramanian, J. Scholz, C. L. Isbell & A. L. Thomaz (2013): Policy shaping: Integrating human feedback with reinforcement learning. In: Advances in neural information processing systems (NeurIPS), pp. 2625–2633.
  7. M. Hanheide, M. Göbelbecker & G. S. Horn (2015): Robot task planning and explanation in open and uncertain worlds. Artificial Intelligence, doi:10.1016/j.artint.2015.08.008.
  8. M. Helmert (2006): The Fast Downward planning system. Journal of Artificial Intelligence Research 26, pp. 191–246, doi:10.1613/jair.1705.
  9. C. Hogg, U. Kuter & H. Muñoz-Avila (2010): Learning Methods to Generate Good Plans: Integrating HTN Learning and Reinforcement Learning. In: Association for the Advancement of Artificial Intelligence (AAAI).
  10. D. Inclezan & M. Gelfond (2016): Modular action language ALM. Theory and Practice of Logic Programming 16(2), pp. 189–235, doi:10.1017/S1471068415000095.
  11. Y. Jiang, F. Yang, S. Zhang & P. Stone (2019): Task-Motion Planning with Reinforcement Learning for Adaptable Mobile Service Robots. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
  12. P. Khandelwal, F. Yang, M. Leonetti, V. Lifschitz & P. Stone (2014): Planning in Action Language BC while Learning Action Costs for Mobile Robots. In: International Conference on Automated Planning and Scheduling (ICAPS).
  13. P. Khandelwal, S. Zhang, J. Sinapov, M. Leonetti, J. Thomason, F. Yang, I. Gori, M. Svetlik, P. Khante & V. Lifschitz (2017): BWIBots: A platform for bridging the gap between AI and human–robot interaction research. The International Journal of Robotics Research 36(5-7), pp. 635–659, doi:10.1177/0278364916688949.
  14. W. B. Knox & P. Stone (2009): Interactively shaping agents via human reinforcement: The TAMER framework. In: Proceedings of the fifth International Conference on Knowledge Capture. ACM, pp. 9–16, doi:10.1145/1597735.1597738.
  15. W. B. Knox & P. Stone (2010): Combining manual feedback with subsequent MDP reward signals for reinforcement learning. In: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, Volume 1. International Foundation for Autonomous Agents and Multiagent Systems, pp. 5–12.
  16. W. B. Knox & P. Stone (2012): Reinforcement learning from simultaneous human and MDP reward. In: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 1. International Foundation for Autonomous Agents and Multiagent Systems, pp. 475–482.
  17. J. Lee, V. Lifschitz & F. Yang (2013): Action Language BC: A Preliminary Report. In: International Joint Conference on Artificial Intelligence (IJCAI).
  18. M. Leonetti, L. Iocchi & F. Patrizi (2012): Automatic generation and learning of finite-state controllers. In: International Conference on Artificial Intelligence: Methodology, Systems, and Applications. Springer, pp. 135–144.
  19. M. Leonetti, L. Iocchi & P. Stone (2016): A synthesis of automated planning and reinforcement learning for efficient, robust decision-making. Artificial Intelligence 241, pp. 103–130, doi:10.1016/j.artint.2016.07.004.
  20. V. Lifschitz & W. Ren (2006): A modular action description language. In: Association for the Advancement of Artificial Intelligence (AAAI), pp. 853–859.
  21. D. Lyu, F. Yang, B. Liu & S. Gustafson (2019): SDRL: Interpretable and Data-efficient Deep Reinforcement Learning Leveraging Symbolic Planning. In: Association for the Advancement of Artificial Intelligence (AAAI).
  22. J. MacGlashan, M. K. Ho, R. Loftin, B. Peng, G. Wang, D. L. Roberts, M. E. Taylor & M. L. Littman (2017): Interactive Learning from Policy-Dependent Human Feedback. In: International Conference on Machine Learning (ICML).
  23. J. MacGlashan, M. L. Littman, D. L. Roberts, R. Loftin, B. Peng & M. E. Taylor (2016): Convergent Actor Critic by Humans. In: International Conference on Intelligent Robots and Systems.
  24. J. McCarthy (1987): Generality in Artificial Intelligence. Communications of the ACM (CACM), doi:10.1145/33447.33448.
  25. V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland & G. Ostrovski (2015): Human-level control through deep reinforcement learning. Nature 518(7540), pp. 529–533, doi:10.1038/nature14236.
  26. A. Y. Ng & S. J. Russell (2000): Algorithms for inverse reinforcement learning. In: International Conference on Machine Learning (ICML).
  27. R. Parr & S. J. Russell (1998): Reinforcement learning with hierarchies of machines. In: Advances in neural information processing systems (NeurIPS), pp. 1043–1049.
  28. J. Peters & S. Schaal (2008): Natural actor-critic. Neurocomputing 71(7), pp. 1180–1190, doi:10.1016/j.neucom.2007.11.026.
  29. S. Rosenthal, M. M. Veloso & A. K. Dey (2011): Learning Accuracy and Availability of Humans Who Help Mobile Robots. In: Association for the Advancement of Artificial Intelligence (AAAI).
  30. S. L. Rosenthal (2012): Human-centered planning for effective task autonomy. Technical Report, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
  31. M. R. K. Ryan (2002): Using abstract models of behaviours to automatically generate reinforcement learning hierarchies. In: Proceedings of the 19th International Conference on Machine Learning (ICML). Morgan Kaufmann, pp. 522–529.
  32. M. R. K. Ryan & M. D. Pendrith (1998): RL-TOPs: An Architecture for Modularity and Re-Use in Reinforcement Learning. In: Proceedings of the Fifteenth International Conference on Machine Learning (ICML). Morgan Kaufmann, pp. 481–487.
  33. J. Schulman, S. Levine, P. Abbeel, M. Jordan & P. Moritz (2015): Trust region policy optimization. In: Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 1889–1897.
  34. J. Schulman, P. Moritz, S. Levine, M. Jordan & P. Abbeel (2015): High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438.
  35. A. Schwartz (1993): A Reinforcement Learning Method for Maximizing Undiscounted Rewards. In: International Conference on Machine Learning (ICML). Morgan Kaufmann, San Francisco, CA, doi:10.1016/B978-1-55860-307-3.50045-9.
  36. R. S. Sutton & A. G. Barto (2018): Reinforcement learning: An introduction. MIT Press.
  37. R. S. Sutton, D. Precup & S. Singh (1999): Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2), pp. 181–211, doi:10.1016/S0004-3702(99)00052-1.
  38. A. L. Thomaz & C. Breazeal (2008): Teachable robots: Understanding human teaching behavior to build more effective robot learners. Artificial Intelligence 172(6-7), pp. 716–737, doi:10.1016/j.artint.2007.09.009.
  39. A. L. Thomaz & C. Breazeal (2006): Reinforcement learning with human teachers: Evidence of feedback and guidance with implications for learning performance. In: Association for the Advancement of Artificial Intelligence (AAAI), Boston, MA, pp. 1000–1005.
  40. P. A. Tsividis, T. Pouncy, J. L. Xu, J. B. Tenenbaum & S. J. Gershman (2017): Human learning in Atari.
  41. R. J. Williams (1992): Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8(3-4), pp. 229–256, doi:10.1023/A:1022672621406.
  42. F. Yang, D. Lyu, B. Liu & S. Gustafson (2018): PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making. In: International Joint Conference on Artificial Intelligence (IJCAI), doi:10.24963/ijcai.2018/675.
