Satellite operators worldwide are racing to deploy and enhance connectivity that supports diverse 5G applications and services. Success depends on delivering superior Quality of Experience (QoE) tailored to each service within the constraints of limited network capacity (Mbps). However, this effort is hindered by unpredictably fluctuating traffic demands, distinct packet arrival distributions across services, and users' evolving, stochastic quality expectations. This paper addresses these challenges by formulating a statistical optimization problem that minimizes the allocated capacity (so that more users can be accommodated) while meeting specific QoE requirements, such as queuing delay. To this end, we leverage packet queuing analysis within the buffer system of the SatCom gateway's forward link. Because the formulated problem is too complex to solve directly, we approximate its constraints using the probability of occurrence. We propose a multi-agent Double Deep Q-Network (DDQN) algorithm that enables a more precise representation of queue-length states and facilitates more informed decision-making by the agents. The approach relies on episodic training, so that agents are trained and optimized through simulations before being deployed in a real-time environment. Extensive simulation campaigns validate the efficiency of our method and show that it outperforms benchmark algorithms.
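
For readers unfamiliar with Double DQN, the following minimal sketch illustrates the standard Double DQN target computation for a single transition: the online network selects the next action, and the target network evaluates it. It is an illustration of the generic technique only, not the multi-agent algorithm proposed in this paper. The function name `ddqn_target`, its parameters, and the numerical Q-values are hypothetical and chosen purely for the example.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard Double DQN target for one transition (illustrative sketch).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three candidate actions (assumed numbers).
q_online = [1.0, 3.0, 2.0]
q_target = [0.5, 1.5, 4.0]

y_double = ddqn_target(
    reward=-1.0, next_q_online=q_online, next_q_target=q_target, gamma=0.5
)
# Plain DQN would instead bootstrap from the maximum target-network value.
y_plain = -1.0 + 0.5 * max(q_target)

print(y_double)  # -0.25
print(y_plain)   # 1.0
```

In this toy case the plain DQN target (1.0) is larger than the Double DQN target (-0.25), which reflects the overestimation that decoupling action selection from action evaluation is designed to reduce.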