The increasing demand for gigabit-per-second speeds and higher wireless node density is driving the need for spatial reuse and for the use of frequencies above the legacy sub-6 GHz bands. Because these super-6 GHz bands experience high path loss, directional beamforming has been the main method of accessing the large bandwidth available at these frequencies. Hence, programming wireless beams toward specific directions is emerging as a requirement for software-defined radio (SDR) platforms. To address this need, we introduce an affordable millimeter-wave (mmWave) testbed. Using a multi-threaded software architecture, the testbed allows mmWave beam directions to be programmed conveniently in a high-level programming language, while also providing access to machine learning (ML) libraries and to SDR methods traditionally deployed on Universal Software Radio Peripheral (USRP) devices. To showcase the potential of the testbed, we tackle the Angle-of-Arrival (AoA) detection problem using reinforcement learning (RL) methods on the receiver side. AoA detection and direction finding are crucial for the emerging use of super-6 GHz spectra. We design and implement Q-learning, Double Q-learning, and Deep Q-learning algorithms that passively inspect the Received Signal Strength (RSS) of the mmWave beam and autonomously predict the AoA. The results indicate the feasibility of programming the directionality of wireless beams via ML-based methods, as well as of solving difficult problems pertaining to emerging directional wireless systems.
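To make the RSS-driven AoA search concrete, the following is a minimal tabular Q-learning sketch in simulation. It is an illustration under stated assumptions, not the formulation used on the testbed. The state is taken to be the index of the current receive beam direction, the actions are hold, steer left, and steer right, and the reward is the change in RSS after the move. The RSS model `simulated_rss`, the 5-degree angle grid, and all hyperparameters are hypothetical choices made only for this example.

```python
import numpy as np

# Hypothetical action set: hold the beam, steer one step left, steer one step right.
MOVES = np.array([0, -1, 1])


def simulated_rss(beam_deg, true_aoa_deg, beamwidth_deg=20.0):
    """Assumed RSS model: a single lobe that peaks when the beam points at the AoA."""
    return -((beam_deg - true_aoa_deg) / beamwidth_deg) ** 2


def train_q_table(true_aoa_deg, angles, episodes=500, steps=40,
                  alpha=0.1, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning over beam indices with an epsilon-greedy behavior policy."""
    rng = np.random.default_rng(seed)
    n_states = len(angles)
    q = np.zeros((n_states, len(MOVES)))
    for _ in range(episodes):
        s = int(rng.integers(n_states))  # start each episode at a random beam
        for _ in range(steps):
            if rng.random() < epsilon:
                a = int(rng.integers(len(MOVES)))  # explore
            else:
                a = int(np.argmax(q[s]))           # exploit
            s_next = int(np.clip(s + MOVES[a], 0, n_states - 1))
            # Reward (assumption): RSS gained or lost by the steering move.
            reward = (simulated_rss(angles[s_next], true_aoa_deg)
                      - simulated_rss(angles[s], true_aoa_deg))
            # Standard Q-learning update.
            q[s, a] += alpha * (reward + gamma * q[s_next].max() - q[s, a])
            s = s_next
    return q


def greedy_aoa_estimate(q, angles, start_idx, max_steps=200):
    """Follow the greedy policy until the beam stops moving; report that angle."""
    s = start_idx
    for _ in range(max_steps):
        a = int(np.argmax(q[s]))
        s_next = int(np.clip(s + MOVES[a], 0, len(angles) - 1))
        if s_next == s:
            break
        s = s_next
    return float(angles[s])


if __name__ == "__main__":
    angles = np.arange(-90, 91, 5)  # assumed beam codebook, in degrees
    q_table = train_q_table(true_aoa_deg=20.0, angles=angles)
    print(greedy_aoa_estimate(q_table, angles, start_idx=0))
```

In this sketch the predicted AoA is the beam direction at which the greedy policy settles. Double Q-learning would keep two such tables and use one to select the action and the other to evaluate it, while Deep Q-learning would replace the table with a neural network. Neither variant is shown here.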