Average episode time steps for five real-world trials. It can be seen that our method, residual recurrent TD3 with impedance controller, learns to complete the task in fewer time steps than other methods.