In this paper, we propose an efficient all-digital demodulator for multi-h continuous phase modulation (CPM) based on a low-complexity decision-directed synchronization algorithm. We derive the maximum-likelihood estimators of the carrier phase and timing errors and propose a reduced-complexity timing error discriminator based on a linear phase approximation (LPA). Compared with the traditional synchronization method, the proposed discriminator eliminates about 2/3 of the matched filter banks, since only the on-time matched filter banks are required. To demonstrate the stability of the proposed algorithm, the S-curve is derived theoretically and plotted using numerical simulation. The numerical results show that the LPA-based synchronization algorithm incurs no loss in bit error rate (BER) performance compared with commonly used methods. The LPA-based synchronization and maximum-likelihood sequence detection for three promising multi-h CPM schemes are then implemented on a Xilinx Kintex-7 FPGA platform, achieving a global clock rate of 200 MHz. The BER of the overall receiver is tested on board, and the results show a negligible loss relative to the numerical simulation. The FPGA implementation overhead is reduced by about 30% compared with the conventional method.
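
As a rough check on the 2/3 figure, suppose the traditional method is an early-late scheme that runs three equally sized matched filter banks (early, on-time, and late), each containing N_MF filters. Both the three-bank structure and the symbol N_MF are assumptions made here for illustration; the abstract states neither. Keeping only the on-time bank would then give a relative saving of

```latex
\[
  \frac{3N_{\mathrm{MF}} - N_{\mathrm{MF}}}{3N_{\mathrm{MF}}} = \frac{2}{3}.
\]
```

The reported overall FPGA saving of about 30% is smaller than 2/3, presumably because the matched filter banks are only one part of the receiver alongside the sequence detector.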