Water level variations influence the biochemical and hydrological processes within river networks. Reliable water segmentation of river camera images can provide practical support for water level monitoring. However, limited annotated data and tedious local deployment restrict the applicability of current deep learning water segmentation models to new river scenes. To improve transferability, this study proposes a novel framework that combines domain-specific models with general-purpose AI for water segmentation. The framework uses a ResUnet model pre-trained on a non-local dataset to identify the pixel in the image with the highest probability of being water. The Segment Anything Model (SAM), a promptable computer vision foundation model developed by Meta AI, then takes this pixel as a prompt to generate water masks. Several SAM prompt modes are compared. We applied the framework to image sequences acquired from river cameras stationed at four locations in Tewkesbury, UK. The framework significantly improved segmentation performance, with an increase of over 15% in Intersection over Union (IoU) over the ResUnet model alone. The results also identified the point prompt as the best mode for passing prior knowledge of water to SAM. The static observer flooding index (SOFI) time series calculated from the framework's segmented masks under point prompt mode show an average correlation of 0.90 with real water level fluctuations, significantly surpassing the correlation of 0.54 attained by ResUnet. Our study thus represents a step toward using river cameras for robust water level trend monitoring.
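The steps summarized above (selecting a point prompt from a probability map, scoring a mask with IoU, and deriving a SOFI series to compare against water levels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names and toy arrays are hypothetical, the SAM call itself is omitted, and SOFI is assumed here to be the fraction of image pixels classified as water.

```python
import numpy as np


def pick_point_prompt(prob_map):
    """Return (x, y) of the pixel with the highest water probability.

    Assumption: prob_map is a 2D array of per-pixel water probabilities,
    such as a segmentation network might output. The (x, y) order is chosen
    because point prompts are commonly given as (column, row).
    """
    row, col = np.unravel_index(np.argmax(prob_map), prob_map.shape)
    return int(col), int(row)


def iou(pred, truth):
    """Intersection over Union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, truth).sum() / union)


def sofi(mask):
    """Assumed SOFI definition: fraction of pixels classified as water."""
    return float(np.asarray(mask, dtype=bool).mean())


def trend_correlation(sofi_series, water_levels):
    """Pearson correlation between a SOFI series and measured water levels."""
    return float(np.corrcoef(sofi_series, water_levels)[0, 1])


# Toy data (hypothetical): a 4x6 probability map peaking at row 2, column 4.
prob_map = np.full((4, 6), 0.1)
prob_map[2, 4] = 0.95
point = pick_point_prompt(prob_map)

# In a full pipeline, `point` would be passed to SAM as a foreground point
# prompt; here a hand-made mask stands in for the returned water mask.
pred_mask = np.array([[1, 1], [0, 0]])
true_mask = np.array([[1, 0], [0, 0]])
score = iou(pred_mask, true_mask)
index = sofi(pred_mask)
```

Reducing each frame to a single SOFI value is what makes the comparison with gauge data possible: the resulting series tracks the trend of the water level rather than its absolute height, so a correlation coefficient is a natural way to evaluate it.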