Reinforcement learning (RL) algorithms enable automated controller configuration without plant-specific parameter knowledge and allow multiple objectives to be considered in a single training phase. Emerging developments in the domain of electric drives feature a holistic design approach and exhibit efficient drive operation after only a few minutes of learning. The exploration this requires includes random actions that collect informative data during the training phase, which may lead to unsafe state reactions and thus to drive damage. Accordingly, this contribution focuses on safety-guaranteed RL-based direct torque control (DTC) for a modulator-driven permanent magnet synchronous motor, i.e., the applicable voltage domain is treated as a continuous control set (CCS). A safeguarding algorithm is derived to secure the online training phase by preventing any voltage with unwanted consequences, e.g., overcurrent, from being applied to the motor. The proposed CCS-RL-DTC is applied and evaluated on a real-world test bench, where comprehensive functionality and performance investigations are conducted with regard to the safeguard, the online training phase, and the resulting torque-tracking behavior.