Accurate estimation of solar-induced fluorescence (SIF) from passively sensed hyperspectral remote sensing data has been identified as instrumental in assessing the photosynthetic activity of plants for a range of scientific and ecological applications across spatial scales. Different techniques to derive SIF have been developed over the last decades. In view of ESA's upcoming Earth Explorer satellite mission FLEX, which aims to provide high-quality global imagery for SIF retrieval, interest in physical approaches has increased. We present a novel method to retrieve SIF in the O2-A absorption band of hyperspectral imagery acquired by the HyPlant sensor system. It aims to tightly integrate physical radiative transfer principles with self-supervised neural network training. To this end, a set of spatial and spectral constraints and a specific loss formulation are adopted. In a validation study, we find good agreement between our approach and established retrieval methods as well as with in-situ top-of-canopy SIF measurements. In two application studies, we additionally find evidence that (i) the estimated SIF satisfies a first-order model of diurnal SIF variation and (ii) the method locally adapts the estimated optical depth in topographically variable terrain.