Federated learning (FL) enables collaborative training of deep learning models among multiple clients while safeguarding data privacy, security, and legal compliance by keeping training data local. Despite these benefits, the wider adoption of FL is hindered by communication overhead and potential privacy risks. Transmitting locally updated model parameters between edge clients and the server demands high communication bandwidth, leading to high latency and Internet infrastructure constraints. Furthermore, recent works have shown that a malicious server can reconstruct clients’ training data from gradients, significantly escalating privacy threats and violating regulations. Various defense techniques have been proposed to address this information leakage from gradients or updates, including adding noise to gradients, model compression (such as sparsification), and feature perturbation. However, these methods either impede model convergence or entail substantial communication costs, further exacerbating the communication demands of FL. To develop efficient and privacy-preserving FL, we introduce an autoencoder-based method that compresses, and thereby perturbs, the model parameters. Each client uses an autoencoder to learn a representation of its local model parameters and shares this representation with the server as the compressed model parameters, rather than the true parameters. The lossy compression performed by the autoencoder serves as effective protection against information leakage from the updates. Additionally, the perturbation is intrinsically linked to the autoencoder’s input, thereby achieving a perturbation adapted to the parameters of each layer. Moreover, our approach reduces the communication cost by 4.1× compared to federated averaging.
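The core idea of transmitting a compact, lossy code of the local update instead of the true parameters can be sketched as follows. This is a minimal illustration using a fixed linear encoder with a pseudo-inverse decoder, not the paper's actual autoencoder architecture; all shapes and names are hypothetical.

```python
import numpy as np

# Minimal sketch of lossy compression of model parameters, standing in
# for the autoencoder-based scheme (hypothetical shapes, not the paper's
# actual architecture). A 4x reduction in transmitted values mirrors the
# reported 4.1x communication saving only loosely.
rng = np.random.default_rng(0)

d, k = 1000, 250                               # parameter dim, code size
W = rng.standard_normal((d, k)) / np.sqrt(d)   # fixed "encoder" weights

def encode(theta):
    """Client side: map the full parameter vector to a compact code."""
    return theta @ W

def decode(code):
    """Server side: approximate reconstruction via a pseudo-inverse decoder."""
    return code @ np.linalg.pinv(W)

theta = rng.standard_normal(d)        # stand-in for a flattened local update
code = encode(theta)                  # what the client actually transmits
theta_hat = decode(code)              # server's lossy reconstruction

print(code.size / theta.size)         # 0.25 -> 4x fewer values sent
print(np.allclose(theta, theta_hat))  # False: reconstruction is perturbed
```

Because `k < d`, the reconstruction cannot be exact; this built-in perturbation is what limits gradient-based data reconstruction attacks in the scheme described above.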
We empirically validate our method with two models widely used in federated learning on three datasets, and compare its performance against several well-established defense frameworks. The results show that our approach attains model performance nearly identical to that of unmodified local updates, while effectively preventing information leakage and reducing communication costs compared to other methods, including noisy gradients, gradient sparsification, and PRECODE.