Opportunity for Master Thesis: Reinforcement Learning for End-to-End Autonomous Driving

Autonomous vehicles are becoming more and more common, and the demand for enabling technologies like machine learning is high. This project will study end-to-end autonomous driving powered by reinforcement learning. Databases with real-traffic video data captured from on-board sensors are currently available (e.g. from Udacity and comma.ai). This video data, however, is fixed: the recording cannot show what would have happened had the vehicle acted differently, so control actions that deviate from the recorded ones are hard to evaluate.

Instead of using video data to train a system, we would like to investigate driving simulators, such as the racing games VDrift and TORCS. Game screenshots would replace the video stream from the camera. With the image data as input and the vehicle steering and acceleration as output, it is possible to build a machine learning system that learns to control the vehicle so that it stays on the road and keeps within the speed limits.
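
The image-in, steering-and-acceleration-out loop described above can be sketched as follows. This is a minimal illustration, not an interface to VDrift or TORCS: the class ToyDrivingEnv, its dynamics, the reward and the hand-written policy are all hypothetical, and the reset/step shape is only modelled loosely on the classic OpenAI Gym convention. The policy also receives a speed reading, which is an extra assumption; a purely image-based agent would have to infer speed from the frames.

```python
import random
from dataclasses import dataclass

WIDTH, HEIGHT = 16, 4    # tiny stand-in for a game screenshot
ROAD_HALF_WIDTH = 1.0    # lateral offset beyond which the car is off the road
SPEED_LIMIT = 1.0


@dataclass
class Action:
    steering: float      # in [-1, 1]; positive moves the car to the right
    acceleration: float  # in [-1, 1]; negative brakes


class ToyDrivingEnv:
    """Hypothetical toy simulator (assumption): not the TORCS or VDrift API."""

    def __init__(self, seed=0, max_steps=200):
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        self.offset = self.rng.uniform(-0.5, 0.5)  # car position relative to road centre
        self.speed = 0.0
        self.steps = 0
        return self._render()

    def step(self, action):
        steering = max(-1.0, min(1.0, action.steering))
        accel = max(-1.0, min(1.0, action.acceleration))
        self.speed = max(0.0, self.speed + 0.1 * accel)
        self.offset += 0.1 * steering * self.speed + self.rng.gauss(0.0, 0.01)
        self.steps += 1
        off_road = abs(self.offset) > ROAD_HALF_WIDTH
        speeding = max(0.0, self.speed - SPEED_LIMIT)
        # Reward progress, penalise drifting from the centre and speeding.
        reward = -10.0 if off_road else self.speed - abs(self.offset) - 5.0 * speeding
        done = off_road or self.steps >= self.max_steps
        info = {"offset": self.offset, "speed": self.speed}
        return self._render(), reward, done, info

    def _render(self):
        # The car sits at the image centre, so the road appears shifted by -offset.
        centre = (WIDTH - 1) / 2 - self.offset * (WIDTH / 4)
        row = [255 if abs(x - centre) <= WIDTH / 4 else 0 for x in range(WIDTH)]
        return [row[:] for _ in range(HEIGHT)]


def centre_following_policy(image, speed):
    """Hand-written baseline: steer towards the bright (road) pixels."""
    row = image[0]
    bright = [x for x, value in enumerate(row) if value > 0]
    if not bright:
        return Action(steering=0.0, acceleration=-1.0)  # no road visible: brake
    road_centre = sum(bright) / len(bright)
    steering = (road_centre - (len(row) - 1) / 2) / (len(row) / 4)
    return Action(steering=steering, acceleration=0.9 * SPEED_LIMIT - speed)


def run_episode(env, policy):
    image = env.reset()
    info = {"offset": env.offset, "speed": env.speed}
    total, done = 0.0, False
    while not done:
        image, reward, done, info = env.step(policy(image, info["speed"]))
        total += reward
    return total, info


if __name__ == "__main__":
    total, info = run_episode(ToyDrivingEnv(seed=0), centre_following_policy)
    print(f"return {total:.2f}, final offset {info['offset']:.3f}, speed {info['speed']:.3f}")
```

In a learning setting, the hand-written policy would be replaced by a trained model, while the environment interface stays the same.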

Project questions

  • What are the key features of Reinforcement Learning frameworks that allow rapid prototyping and repeatable comparisons of different learning algorithms? You will investigate the design and possibilities of Reinforcement Learning frameworks like OpenAI Gym, ViZDoom, The Arcade Learning Environment and RL Competition, and implement either a new framework or a new environment for an existing framework based on a vehicle simulator like VDrift, TORCS or the VICTA LAB simulator.
  • Based on a literature survey, which of the existing machine learning techniques are suitable for vehicle control? For a few promising candidates, how do they perform relative to each other in a detailed quantitative comparison? In this part you will explore the potential of Reinforcement Learning and/or other types of Machine Learning algorithms (semi-supervised learning, inverse reinforcement learning, apprenticeship learning etc.) for vehicle control.
  • Can an unsupervised learning algorithm come close to a hand-tuned, specialized algorithm? The TORCS Racing Board accepts submissions of virtual "drivers" or robots, and training such a robot with several algorithms can be a way to answer this question.
  • How well can a control policy learned in a virtual environment be transferred to a similar but physical environment? Explore the possibility of using a neural network trained in a virtual environment to control e.g. a miniature vehicle.
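
As a rough illustration of what a repeatable, quantitative comparison could look like, the sketch below runs each candidate controller on the same fixed set of random seeds and reports the mean and standard deviation of its score. The toy lane-keeping task, the score definition and both controllers are assumptions made for this example; in the project they would be replaced by a real simulator and learned policies.

```python
import random
import statistics


def run_episode(policy, seed, steps=100):
    """Toy lane-keeping task (assumption, not a real simulator).

    Returns the negative mean absolute lateral offset, so higher is better.
    """
    rng = random.Random(seed)          # all randomness comes from the seed
    offset = rng.uniform(-0.5, 0.5)
    total = 0.0
    for _ in range(steps):
        steering = max(-1.0, min(1.0, policy(offset)))
        offset += 0.1 * steering + rng.gauss(0.0, 0.02)
        total += abs(offset)
    return -total / steps


def compare(policies, seeds):
    """Evaluate every policy on the same seeds and summarise the scores."""
    seeds = list(seeds)
    results = {}
    for name, policy in policies.items():
        scores = [run_episode(policy, seed) for seed in seeds]
        results[name] = (statistics.mean(scores), statistics.stdev(scores))
    return results


# Hypothetical candidates: a do-nothing baseline and a proportional controller.
POLICIES = {
    "no_steering": lambda offset: 0.0,
    "proportional": lambda offset: -2.0 * offset,
}

if __name__ == "__main__":
    for name, (mean, std) in compare(POLICIES, seeds=range(10)).items():
        print(f"{name:>12}: mean score {mean:.3f} +/- {std:.3f}")
```

Because every candidate sees the same seeds, differences in score reflect the controllers rather than differences in the random disturbances, and rerunning the script reproduces the same numbers.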


Alexey Voronov and Cristofer Englund.

About us

Viktoria Swedish ICT, a member of RISE, is a non-profit research institute dedicated to enabling sustainable mobility through the use of information and communication technology (ICT). We work to eliminate fossil dependency, accidents, and impact on climate and environment. To become a stronger innovation partner for businesses and society we are now merging with other RISE institutes, and at the end of the year we will change our name to RISE.