To support the development of wearable medical devices for remote monitoring and treatment of cardiovascular diseases, we tackle the data scarcity problem that hinders the application of machine learning methods. We propose a self-supervised approach applied to cardiological signals, which benefits from existing datasets despite differences between them and inconsistencies within them. We develop a specific implementation: a diffusion autoencoder with a semantic encoder based on linear recurrent units, trained on ECG signals (various leads mixed together) without any annotations. The semantic encoder is evaluated as a feature extractor by measuring classification metrics of a logistic regression on a dataset not included in the self-supervised training. We obtain promising results and propose future directions.
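The evaluation protocol described above, in which a frozen encoder supplies features to a logistic regression, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `frozen_encoder` is a hypothetical stand-in that computes simple summary statistics rather than the LRU-based semantic encoder, the signals and labels are synthetic, AUROC is chosen as one example of a classification metric, and the train/test split is drawn from within the synthetic labeled set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def frozen_encoder(signals: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a pretrained semantic encoder.

    Maps signals of shape (n_samples, n_timesteps) to features of shape
    (n_samples, 3). It uses summary statistics only; it is NOT the
    LRU-based encoder from the paper and has no trainable weights.
    """
    mean = signals.mean(axis=1)
    std = signals.std(axis=1)
    mean_abs_diff = np.abs(np.diff(signals, axis=1)).mean(axis=1)
    return np.stack([mean, std, mean_abs_diff], axis=1)


def linear_probe(signals: np.ndarray, labels: np.ndarray, seed: int = 0) -> float:
    """Fit a logistic regression on frozen features and return test AUROC."""
    # The encoder is only used for inference; it is never updated here.
    features = frozen_encoder(signals)
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=seed, stratify=labels
    )
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x_train, y_train)
    # Probability of the positive class, used to rank test samples.
    scores = clf.predict_proba(x_test)[:, 1]
    return roc_auc_score(y_test, scores)


# Synthetic labeled set (assumption): class 1 has a larger signal amplitude.
rng = np.random.default_rng(0)
n_samples, n_timesteps = 400, 256
labels = rng.integers(0, 2, size=n_samples)
signals = rng.normal(size=(n_samples, n_timesteps)) * (1.0 + 0.5 * labels[:, None])

auc = linear_probe(signals, labels)
print(f"held-out AUROC: {auc:.3f}")
```

Keeping the classifier linear means the resulting metric mostly reflects how much label-relevant information the frozen features already expose, rather than the capacity of the downstream model.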