This paper proposes an inflow scenario reduction framework for long-term hydro-thermal scheduling. Long-term hydro-thermal scheduling concerns the strategic management of limited stored hydro energy in interaction with the electricity system. The Scenario Fan Problem (SFP) is selected as the long-term hydro-thermal scheduling approach. This approach supports a disaggregated physical representation of the hydro stations and captures the short-term flexibility of individual hydropower plants in the long-term schedule. We develop a shape-based feature extraction machine learning method to capture the features of the hydro inflows. To assess its performance, we compare the shape-based method with five alternative clustering methods through a validation process on a hydro-thermal test case, using both in-sample and out-of-sample data analysis. Our results show that the proposed framework reduces the computational time of long-term hydro-thermal scheduling by around 90\%. They also show that shape-based feature extraction performs better than the alternative methods in terms of computational time and expected reservoir volumes. Moreover, this study demonstrates that long-term hydro-thermal scheduling can be performed at a disaggregated level, with detailed topology information and a fine time resolution. This provides more opportunities to address the penetration of renewable energy resources in long-term hydro-thermal scheduling.