Modern distribution networks face several technical challenges, such as voltage fluctuations, because of the high penetration of distributed energy resources (DERs). This paper proposes a deep reinforcement learning (DRL)-based Volt-VAR co-optimization technique that reduces voltage fluctuations and power loss under high DER penetration. In addition, the proposed approach minimizes the operational cost of the grid. A soft actor-critic (SAC) agent, based on stochastic policy optimization, is proposed to determine the optimal set-points of the inverters' reactive power outputs. The performance of the proposed model is verified on the modified IEEE 34- and 123-bus systems and compared with a base case in which the inverters supply no reactive power and with a local droop control approach. The results demonstrate that the proposed framework outperforms the conventional droop control method in improving the voltage profile, reducing network power loss, and lowering grid operational cost.
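
For readers unfamiliar with the local droop baseline mentioned above, a conventional Volt-VAR droop rule can be sketched as a piecewise-linear function of the locally measured voltage. The following is a minimal illustration only: the function name `droop_reactive_power`, the voltage breakpoints (0.95, 0.98, 1.02, and 1.05 p.u.), and the sign convention are assumptions chosen for illustration and are not taken from this paper.

```python
def droop_reactive_power(v_pu, q_max,
                         v_low=0.95, v_dead_low=0.98,
                         v_dead_high=1.02, v_high=1.05):
    """Return an inverter reactive power set-point from local voltage.

    Assumed convention: positive output injects reactive power
    (raises voltage), negative output absorbs it (lowers voltage).
    All breakpoints are hypothetical defaults in per unit.
    """
    if v_pu <= v_low:
        # Severe undervoltage: inject at full capability.
        return q_max
    if v_pu < v_dead_low:
        # Linear ramp from full injection down to zero.
        return q_max * (v_dead_low - v_pu) / (v_dead_low - v_low)
    if v_pu <= v_dead_high:
        # Deadband: no reactive support.
        return 0.0
    if v_pu < v_high:
        # Linear ramp from zero down to full absorption.
        return -q_max * (v_pu - v_dead_high) / (v_high - v_dead_high)
    # Severe overvoltage: absorb at full capability.
    return -q_max


if __name__ == "__main__":
    for v in (0.94, 0.965, 1.00, 1.035, 1.06):
        print(f"V = {v:.3f} p.u. -> Q = {droop_reactive_power(v, 100.0):+.1f} kVAR")
```

Such a rule uses only the local voltage measurement, which is why it serves as a natural reference point for a learned policy that selects the inverter set-points.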