This paper proposes a multi-dimensional search space (or directional space) with a larger degree of freedom (DOF) to improve the energy efficiency of a battery-limited UAV in an Internet of Things (IoT) data collection scenario. The UAV navigates from an initial point to a goal point while collecting data from IoT sensors on the ground. Because the UAV's battery capacity is limited, trajectory optimization is a crucial practical issue. Given the available directional space, the direction of the UAV, which determines its navigation trajectory, is optimized using reinforcement learning. The objective of the reinforcement learning is to maximize the energy efficiency of the UAV as a long-term reward by selecting the optimized direction. Moreover, a practical energy consumption model and environment are presented. Simulation results verify that the proposed scheme, which has a larger directional space, achieves higher energy efficiency than the benchmark models.
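
The idea of selecting a direction from a directional space with a long-term energy-efficiency reward can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the directional space is modeled as evenly spaced 2D headings, energy efficiency is taken as collected bits per joule, and a tabular Q-learning update stands in for the unspecified reinforcement learning method. The names `directional_space`, `energy_efficiency`, and `q_update` are hypothetical.

```python
import math


def directional_space(num_directions):
    """Return unit heading vectors evenly spaced over 360 degrees.

    Assumption: a larger num_directions models a larger degree of freedom.
    """
    step = 2.0 * math.pi / num_directions
    return [(math.cos(k * step), math.sin(k * step)) for k in range(num_directions)]


def energy_efficiency(bits_collected, energy_joules):
    """Assumed reward: collected data per unit of consumed energy (bits/J)."""
    if energy_joules <= 0:
        return 0.0
    return bits_collected / energy_joules


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (a stand-in for the paper's RL method).

    q maps each state to a list of action values, one per direction.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Usage: a 4-direction space versus an 8-direction space.
small_space = directional_space(4)
large_space = directional_space(8)

# Two hypothetical states, one action value per direction in the large space.
q_table = {0: [0.0] * len(large_space), 1: [0.0] * len(large_space)}
reward = energy_efficiency(bits_collected=100.0, energy_joules=50.0)
q_update(q_table, state=0, action=3, reward=reward, next_state=1)
```

In this sketch the discount factor `gamma` is what makes the reward long-term: each update accounts for the best value reachable from the next state, not only the immediate energy efficiency.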