Zero-energy radios in energy-constrained devices are envisioned as key enablers of next-generation Internet-of-Things (NG-IoT) networks for ultra-dense sensing and monitoring. This paper presents analytical modeling and analysis of the energy-efficient uplink transmission of an energy-constrained secondary sensor operating opportunistically among several primary sensors. The considered scenario assumes that all primary sensors transmit in a round-robin, time-division multiple access-based scheme, and that the secondary sensor is admitted into the time slot of each primary sensor using a nonorthogonal multiple access technique inspired by cognitive radio. The energy efficiency of the secondary sensor is maximized using a deep reinforcement learning algorithm known as deep deterministic policy gradient (DDPG). Our results demonstrate that the DDPG-based transmission scheme outperforms the conventional random and greedy algorithms in terms of energy efficiency under different operating conditions.