This paper introduces a novel hybrid precoding architecture, together with several hardware-aware delay-phase precoding schemes, that addresses the beam squint effect while complying with the limitations of true time delay (TTD) elements and phase shifters (PSs). The proposed architecture significantly alleviates the maximum delay range constraint imposed on TTDs. We also address the time resolution constraint of TTDs by analyzing the resulting resolution error and compensating for it in the phase domain. Similarly, we account for the resolution constraint of PSs by proposing a practical precoding scheme that jointly optimizes the TTDs and PSs under their hardware limitations. Specifically, we formulate the design of the hybrid analog precoder under the TTD and PS constraints as a mixed-integer optimization problem and propose an alternating-minimization-based iterative algorithm to solve it. To reduce the computational complexity of this algorithm, we further propose a low-complexity precoding algorithm that maintains satisfactory performance with minimal computational overhead.

Part of this work has been accepted for presentation at IEEE ICC 2023¹. That work presents a preliminary investigation of hybrid analog precoding under the hardware limitations of TTDs, in which the finite-resolution limitation of TTDs is the primary concern in the design of the hybrid analog precoder. It proposes an iterative delay-phase precoding algorithm that achieves remarkably higher array gain than a recent state-of-the-art work in the literature. The efficacy of this algorithm is validated there in a 2D scenario assuming uniform linear arrays (ULAs), and it is further evaluated in a 3D scenario assuming uniform planar arrays (UPAs) in a conference paper submitted to IEEE IWMTS 2023².
Nevertheless, this algorithm suffers an array gain loss as the maximum delay range supported by the TTDs decreases. In the same vein, a remarkable gain loss occurs as the number of antenna elements increases, which makes the algorithm unscalable for UM-MIMO systems. Motivated by these drawbacks, we propose in this work a novel delay-phase precoding architecture that alleviates the effect of the maximum delay range limitation of TTDs to a large extent, surpassing our work in IEEE ICC 2023. The proposed architecture does not require larger time delay values as the number of antennas increases, which makes it suitable for UM-MIMO systems. Moreover, in addition to the hardware limitations of TTDs, we consider in this work the hardware limitations of PSs, rather than assuming impractical infinite-resolution PSs.
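
To make the idea of compensating the TTD resolution error in the phase domain more concrete, the following minimal sketch may help. It is not the algorithm proposed in this paper. It assumes a single TTD whose delay is rounded to the nearest multiple of its time resolution, followed by an ideal frequency-flat phase shifter that cancels the leftover delay error at the carrier frequency. All function names and numerical values (300 GHz carrier, 1 ps resolution, 12.3 ps ideal delay) are hypothetical and chosen only for illustration.

```python
import math


def quantize_delay(delay_s, resolution_s):
    """Round an ideal delay to the nearest multiple of the TTD time resolution."""
    return round(delay_s / resolution_s) * resolution_s


def compensation_phase(delay_s, resolution_s, carrier_hz):
    """Frequency-flat phase (radians) that cancels the quantization error at the carrier."""
    error_s = delay_s - quantize_delay(delay_s, resolution_s)
    return -2.0 * math.pi * carrier_hz * error_s


def phase_error(delay_s, resolution_s, carrier_hz, freq_hz, compensate):
    """Phase mismatch (radians) at freq_hz relative to an ideal, unquantized TTD."""
    ideal = -2.0 * math.pi * freq_hz * delay_s
    actual = -2.0 * math.pi * freq_hz * quantize_delay(delay_s, resolution_s)
    if compensate:
        # The phase shifter adds the same phase at every frequency (assumed ideal).
        actual += compensation_phase(delay_s, resolution_s, carrier_hz)
    return actual - ideal


if __name__ == "__main__":
    carrier = 300e9      # hypothetical carrier frequency in Hz
    band_edge = 315e9    # hypothetical upper band edge in Hz
    delay = 12.3e-12     # hypothetical ideal delay in seconds
    resolution = 1e-12   # hypothetical TTD time resolution in seconds
    for f in (carrier, band_edge):
        raw = phase_error(delay, resolution, carrier, f, compensate=False)
        fixed = phase_error(delay, resolution, carrier, f, compensate=True)
        print(f"f = {f / 1e9:.0f} GHz: uncompensated {raw:.4f} rad, compensated {fixed:.4f} rad")
```

Under these assumptions the residual phase error after compensation equals 2π(f − f_c)e, where e is the delay quantization error and f_c the carrier frequency. It therefore vanishes at the carrier and stays small as long as the offset from the carrier is small relative to the carrier itself. A finite-resolution PS would add its own quantization error on top, which this sketch does not model.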