Explanation:
The hard part of verifiably constrained reinforcement learning is getting the constraints right.
This web demo shows the difference between the definition of safety
and the constraints on the state-action space necessary to maintain safety.
▶️⏹️
The colored bars at the botton of the screen indicate the regions where the car
may accelerate according to the safety guard (green), may accelerate or brake
according to the safety guard (yellow), and where a crash has already happened (red).
todo: make red also indicate where a crash is immenent but has not already happened.
if(
leader_pos - follower_pos) > SAFE_SEP
)
{
available_actions := {accelerate, brake};
}
else
{
available_actions := {brake};
}
t := 0;
choose action from available_actions;
{ follower_pos' = rvel, rvel' = action & t <= T }