Chemical process design is the search for an optimal manufacturing protocol for performing chemical operations. For transient processes such as crystallization, the optimal conditions can change over time, requiring a dynamic operating strategy. Model-free deep reinforcement learning can identify the best sequence of states with respect to a predefined reward function. In this work, proximal policy optimization is applied in a simulated environment to identify operating strategies that are optimal with respect to the desired particle properties in unseeded batch cooling crystallization of paracetamol in ethanol. For this purpose, the corresponding Markov decision process is formulated, and the method is shown to be promising for developing novel routes that allow tuning of the particle size (623 μm) and provide high yields (96%) within a defined period of time (12 h).
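To make the Markov decision process formulation concrete, the sketch below shows a toy gym-style batch cooling crystallization environment with a state of (temperature, concentration, mean particle size), a cooling-rate action, and a terminal reward trading off a target size against yield over a 12 h horizon. All dynamics, rate constants, and names here are hypothetical illustrations, not the paper's actual process model or reward function.

```python
# Illustrative sketch only: a toy MDP for unseeded batch cooling
# crystallization. All kinetics and constants below are invented for
# demonstration and do not reproduce the paper's simulation.

class ToyCrystallizationEnv:
    """Minimal gym-style environment; state = (T, concentration, mean size)."""

    def __init__(self, horizon_h=12.0, dt_h=0.5):
        self.horizon_h = horizon_h  # batch duration, hours
        self.dt_h = dt_h            # control interval, hours
        self.reset()

    def reset(self):
        self.t = 0.0
        self.T = 50.0        # temperature, deg C
        self.c0 = 0.30       # initial solute concentration, g/g solvent
        self.c = self.c0
        self.size_um = 0.0   # mean particle size, micrometres
        return (self.T, self.c, self.size_um)

    def solubility(self, T):
        # hypothetical linear solubility curve c*(T)
        return 0.10 + 0.004 * T

    def step(self, cooling_rate):
        # action: cooling rate in K/h (>= 0), applied over one interval
        self.T = max(10.0, self.T - cooling_rate * self.dt_h)
        s = max(0.0, self.c - self.solubility(self.T))  # supersaturation
        self.size_um += 50.0 * s * self.dt_h            # toy growth kinetics
        self.c -= 0.5 * s * self.dt_h                   # solute consumed
        self.t += self.dt_h
        done = self.t >= self.horizon_h
        if done:
            # terminal reward: penalize deviation from a 623 um target,
            # reward fractional yield (both terms are illustrative)
            yield_frac = (self.c0 - self.c) / max(
                1e-9, self.c0 - self.solubility(self.T))
            reward = -abs(self.size_um - 623.0) / 623.0 + yield_frac
        else:
            reward = 0.0
        return (self.T, self.c, self.size_um), reward, done


# Roll out a naive constant-cooling-rate policy; a PPO agent would instead
# choose cooling_rate from its policy network at each step.
env = ToyCrystallizationEnv()
state = env.reset()
done = False
while not done:
    state, reward, done = env.step(cooling_rate=3.0)
```

In this setup a learned policy replaces the constant cooling rate, selecting an action at each control interval from the observed state; any standard PPO implementation that accepts a gym-style `reset`/`step` interface could be trained against such an environment.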