Comparing Physics Effects through Reinforcement Learning in the ARORA Simulator

  • Troyle Thomas,
  • Armando Fandango,
  • Dean Reed,
  • Clive Hoayun,
  • Jonathan Hurter,
  • Alexander Gutierrez,
  • Keith Brawner
  • a,b,c,d,e,f Institute for Simulation and Training, 3100 Technology Pkwy, Orlando, FL 32826, USA
  • g U.S. Army Combat Capabilities Development Command Soldier Center (CCDC SC), 12423 Research Pkwy, Orlando, FL 32826, USA
Cite as
(a) Troyle Thomas, (b) Armando Fandango, (c) Dean Reed, (d) Clive Hoayun, (e) Jonathan Hurter, (f) Alexander Gutierrez, (g) Keith Brawner (2021). Comparing Physics Effects through Reinforcement Learning in the ARORA Simulator. Proceedings of the 33rd European Modeling & Simulation Symposium (EMSS 2021), pp. 107-115. DOI: https://doi.org/10.46354/i3m.2021.emss.015

Abstract

This study tests various physics levels for training autonomous-vehicle navigation with a deep deterministic policy gradient algorithm, addressing a lack of research on how physics levels affect vehicle behaviour, specifically for reinforcement-learning algorithms. Three measures from a PointGoal Navigation task were investigated: simulator run-time, training steps, and agent effectiveness, the last assessed through the Success weighted by (normalised inverse) Path Length (SPL) measure. Training and testing occurred in the novel simulator ARORA (A Realistic Open environment for Rapid Agent training). The goal of ARORA is to provide a high-fidelity, open-source platform for simulation, using physics-based movement, vehicle modelling, and a continuous action space within a large-scale geospecific city environment. Four physics levels, or models, were used to create four different curriculum conditions for training; SPL was highest for the condition using all physics levels defined for the experiment, and two conditions returned zero values. Future researchers should consider providing adequate support when training complex-physics vehicle models. The run-time results revealed a benefit for experimental machines with a better CPU, at least for the vector-only observations we employed.
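
As an illustration of the SPL measure named in the abstract, the sketch below follows the definition commonly used in the embodied-navigation literature: each episode contributes its binary success multiplied by the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length, and the contributions are averaged over episodes. The function name `spl` and its argument names are hypothetical and are not taken from ARORA; this is a minimal sketch under those assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def spl(
    successes: Sequence[int],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by (normalised inverse) Path Length.

    successes[i]        -- 1 if episode i reached the goal, else 0
    shortest_lengths[i] -- shortest-path distance from start to goal
    path_lengths[i]     -- length of the path the agent actually took
    (All names are illustrative assumptions, not ARORA identifiers.)
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        denom = max(p, l)  # a path shorter than the optimum cannot score above 1
        if denom <= 0:
            raise ValueError("path lengths must be positive")
        total += s * l / denom
    return total / n


# Three episodes: optimal success, success with a path twice as long, failure.
score = spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0])
```

Under this definition, a condition in which no episode succeeds scores exactly zero regardless of path lengths, which is consistent with an SPL of zero being reported for a training condition.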
