We consider a single-cell downlink scenario in which the BS is equipped with four antennas (M = 4) and serves single-antenna users. The dataset is built from channel realizations drawn at random from a normal distribution with zero mean and unit variance. It is then reshaped and converted to the real-valued domain. The input dataset is normalized by the transmit data symbol so that its entries lie within a nominal range, which may aid training. We generate 50,000 training samples and 2,000 test samples. The transmit data symbols are modulated using QPSK. The training SINR is drawn at random from a uniform distribution, Γtrain ∼ U(Γlow, Γhigh). Stochastic gradient descent is used, with the Lagrangian function as the loss. A parametric rectified linear unit (PReLU) activation function is used for the convolutional and fully connected layers of the full-precision model, and a low-bit activation function is used for the quantized model. After every iteration, the learning rate is reduced by a factor α = 0.65 to help the learning algorithm converge faster.
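The data generation and learning-rate schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact pipeline used in the experiments: the number of users (`n_users=4`), the SINR bounds (`gamma_low`, `gamma_high`, taken here in dB), the initial learning rate, and the function names are all hypothetical. Drawing the real and imaginary parts of each channel coefficient independently from N(0, 1) is likewise an assumption, and the normalization by the transmit data symbol is left out because its exact form is not specified here.

```python
import numpy as np


def make_dataset(n_samples, n_users=4, n_antennas=4,
                 gamma_low=0.0, gamma_high=30.0, seed=0):
    """Generate channels, QPSK symbols and training SINR values.

    n_users, gamma_low and gamma_high are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Complex channel realizations; real and imaginary parts are each
    # drawn from a zero-mean, unit-variance normal distribution (assumed).
    shape = (n_samples, n_users, n_antennas)
    h = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

    # Convert to the real-valued domain: place real and imaginary parts
    # side by side, then flatten each sample into one feature vector.
    h_real = np.concatenate([h.real, h.imag], axis=-1).reshape(n_samples, -1)

    # QPSK symbols with unit modulus, one per user.
    bits = rng.integers(0, 2, size=(n_samples, n_users, 2))
    s = ((2 * bits[..., 0] - 1) + 1j * (2 * bits[..., 1] - 1)) / np.sqrt(2)

    # Training SINR drawn uniformly from [gamma_low, gamma_high).
    gamma = rng.uniform(gamma_low, gamma_high, size=(n_samples, 1))
    return h_real, s, gamma


def decayed_learning_rate(lr0, iteration, alpha=0.65):
    """Learning rate after `iteration` multiplicative reductions by alpha."""
    return lr0 * alpha ** iteration


# 50,000 training samples and 2,000 test samples, as in the text.
h_train, s_train, gamma_train = make_dataset(50_000, seed=0)
h_test, s_test, gamma_test = make_dataset(2_000, seed=1)
```

With M = 4 antennas and the assumed four users, each sample becomes a real-valued vector of length 2 × 4 × 4 = 32. Separate seeds are used for the training and test sets so that the two do not share channel realizations.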