The phytoplankton absorption coefficient ($a_{phy}$) is a key inherent optical property (IOP) in ocean color remote sensing, and hyperspectral $a_{phy}$ in particular greatly enhances our understanding of phytoplankton community composition (PCC). The recent launches of NASA's hyperspectral missions, such as EMIT and PACE, have created an urgent need for hyperspectral algorithms for studying phytoplankton. Retrieving $a_{phy}$ from ocean color remote sensing in coastal waters remains extremely challenging because the complex optical properties of these waters cause traditional methods to fail. Machine learning approaches offer an alternative, but they are hindered by data scarcity, heterogeneity, and noise introduced during data collection. To address these limitations, this study introduces PhA-MOE, a novel machine learning framework based on a mixture-of-experts (MOE) design for hyperspectral retrieval of $a_{phy}$. Various preprocessing methods for hyperspectral training data are explored, and the combination of robust and logarithmic scalers is identified as optimal. PhA-MOE is tailored to both past and current hyperspectral missions, including HICO, EMIT, and PACE. Extensive experiments demonstrate the importance of data preprocessing and show that PhA-MOE improves both the estimation of $a_{phy}$ and the handling of data heterogeneity.
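The two ideas named above, combined logarithmic and robust scaling and a gated mixture of experts, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not the PhA-MOE implementation: the function names (`log_robust_fit`, `log_robust_transform`, `log_robust_inverse`, `moe_forward`) are hypothetical, the scalers are applied in the order log-then-robust to a single array, and the experts are plain linear maps rather than trained networks.

```python
import numpy as np

def log_robust_fit(x, eps=1e-6):
    """Fit per-band median and interquartile range on log10-transformed data.

    Assumption: x is a (samples, bands) array of positive spectra.
    """
    logx = np.log10(np.maximum(x, eps))
    median = np.median(logx, axis=0)
    q75, q25 = np.percentile(logx, [75, 25], axis=0)
    # Guard against constant bands, where the IQR would be zero.
    iqr = np.where(q75 - q25 > 0, q75 - q25, 1.0)
    return median, iqr

def log_robust_transform(x, median, iqr, eps=1e-6):
    """Apply the log10 transform, then center by median and scale by IQR."""
    return (np.log10(np.maximum(x, eps)) - median) / iqr

def log_robust_inverse(z, median, iqr):
    """Map scaled values back to the original physical units."""
    return 10.0 ** (z * iqr + median)

def moe_forward(x, gate_w, expert_ws):
    """Softmax-gated mixture of linear experts (illustrative only).

    x: (n, d_in); gate_w: (d_in, k); expert_ws: list of k arrays (d_in, d_out).
    Returns the gate-weighted prediction (n, d_out) and the gates (n, k).
    """
    logits = x @ gate_w
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    gates = np.exp(logits)
    gates = gates / gates.sum(axis=1, keepdims=True)
    expert_out = np.stack([x @ w for w in expert_ws], axis=1)  # (n, k, d_out)
    return np.einsum("nk,nkd->nd", gates, expert_out), gates

# Usage on synthetic, log-normally distributed "spectra" (not real data).
rng = np.random.default_rng(0)
spectra = rng.lognormal(mean=-3.0, sigma=1.0, size=(200, 8))
median, iqr = log_robust_fit(spectra)
scaled = log_robust_transform(spectra, median, iqr)
restored = log_robust_inverse(scaled, median, iqr)

gate_w = rng.normal(size=(8, 3))
expert_ws = [rng.normal(size=(8, 5)) for _ in range(3)]
pred, gates = moe_forward(scaled, gate_w, expert_ws)
```

Median and IQR scaling is less sensitive to outliers than mean and standard deviation scaling, and the log transform compresses the wide dynamic range typical of absorption data. The gate assigns each sample a soft weighting over experts, which is one way a model can specialize on heterogeneous subsets of the data.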