Learning the worst response time
Response time analysis is essential to understanding the behavior of real-time systems. Static code analysis is often proposed, but tends not to scale to industrial systems. We propose using reinforcement learning to discover the execution scenarios that lead to the worst-case response time.
This is the second paper by Mahshid in her very productive first six months. She continued to explore the potential for machine learning-based response time analysis for real-time systems at Bombardier as part of the TESTOMAT project. Just like in the preceding paper, the approach is based on Q-learning, i.e., a type of reinforcement learning. In the previous paper we envisioned using Q-learning for adaptive control, now we propose using it to learn the execution scenarios that cause the worst possible response times.
Response-time analysis is fundamental when developing critical real-time systems. Static analysis is often presented as an approach to find the worst-case response time (WCRT), but it is frequently infeasible in practice – the approach doesn’t scale to the high complexity of industrial real-time systems.
In this short paper, we propose a new approach to finding the execution scenarios that lead to the WCRT. Our idea is to use a simulation-based method using Q-learning to identify the critical execution scenarios, and then to calculate the corresponding WCRT. A major advantage of this approach is that we don’t need to develop a detailed model of the running system and the execution environment – Q-learning is an example of model-free reinforcement learning.
Does it sound too good to be true? We don’t really know yet. This is just the first paper, in which we propose the idea and formalize the application of Q-learning, including the computation of the reward signal that should ensure the agent selects actions leading to the execution scenario that yields the WCRT. The next paper will have to show some simulation results!
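The core idea can be sketched in a few lines. Below is a minimal, single-state Q-learning loop where the reward is simply the observed response time, so the greedy policy gravitates toward the scenario with the worst response. The simulator, the four candidate scenarios, and their timing values are all hypothetical stand-ins – the paper's actual reward formalization and environment are richer than this toy.

```python
import random
from collections import defaultdict

def simulate_response_time(scenario):
    """Toy stand-in for a real-time system simulator (hypothetical).
    Scenario 3 models the case where interfering tasks arrive together,
    producing the longest response time; jitter adds run-time variation."""
    base = {0: 2.0, 1: 3.5, 2: 5.0, 3: 8.0}[scenario]
    return base + random.uniform(0.0, 0.5)

def q_learning_wcrt(episodes=2000, alpha=0.1, epsilon=0.2):
    """Epsilon-greedy Q-learning over execution scenarios.
    Reward = measured response time, so maximizing reward means
    steering the simulation toward the worst-case response time."""
    actions = list(range(4))          # candidate execution scenarios
    q = defaultdict(float)            # Q-value per scenario
    worst_seen = 0.0                  # running WCRT estimate
    for _ in range(episodes):
        if random.random() < epsilon:           # explore
            a = random.choice(actions)
        else:                                   # exploit current knowledge
            a = max(actions, key=lambda s: q[s])
        r = simulate_response_time(a)           # reward = response time
        worst_seen = max(worst_seen, r)
        q[a] += alpha * (r - q[a])              # incremental Q update
    critical_scenario = max(actions, key=lambda s: q[s])
    return critical_scenario, worst_seen
```

Running the loop, the agent discovers the critical scenario without any model of the system internals – it only needs to execute scenarios and observe response times, which is exactly the model-free property the post highlights.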
Implications for Research
- As an alternative to static analysis, we propose using reinforcement learning in a simulated environment to learn the worst possible execution scenario.
- We show how Q-learning can be used for learning-based response time analysis.
Mahshid Helali Moghadam, Mehrdad Saadatmand, Markus Borg, Markus Bohlin, and Björn Lisper. Learning-based Response Time Analysis in Real-Time Embedded Systems: A Simulation-based Approach, In Proc. of the 1st International Workshop on Software Qualities and their Dependencies, 2018. (link, preprint)
Response time analysis is an essential task to verify the behavior of real-time systems. Several response time analysis methods have been proposed to address this challenge, particularly for real-time systems with different levels of complexity. Static analysis is a popular approach in this context, but its practical applicability is limited due to the high complexity of industrial real-time systems, as well as the many unpredictable run-time events in these systems. In this work-in-progress paper, we propose a simulation-based response time analysis approach using reinforcement learning to find the execution scenarios leading to the worst-case response time. The approach learns how to provide a practical estimation of the worst-case response time through simulating the program without performing static analysis. Our initial study suggests that the proposed approach could be applicable in the simulation environments of industrial real-time control systems to provide a practical estimation of the execution scenarios leading to the worst-case response time.