Raman spectroscopy, a label-free sensing technique, is commonly used for real-time monitoring of key process parameters during the cultivation of cells expressing recombinant proteins. However, obtaining accurate parameter values requires a large quantity of offline measurement data, and collecting these data is time-consuming and labor-intensive. To address this problem and the limitations of conventional, complex data preprocessing, this study presents a genetic algorithm-based semi-supervised convolutional neural network (GA-SCNN). The GA-SCNN performs feature extraction and unsupervised sequence labeling, and was applied to a model system of E. coli expressing the recombinant protein ProA5M. By combining model prediction with sequence interpolation, the GA-SCNN expanded the database for glucose, lactate, ammonium ions, and OD600 from 52 to 1302 samples (about 25-fold). A comparison with standard regression algorithms demonstrated the superior predictive performance of the GA-SCNN framework on a large volume of spectral data without any preprocessing. Cross-validation confirmed the accuracy and robustness of the model, as reflected in the coefficients of determination. In addition, a transfer learning strategy using the OD600 data and limited recombinant protein expression data was employed to develop a prediction model for the target protein. Validation experiments showed good agreement between model predictions and offline results.
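The sequence interpolation step mentioned above can be illustrated with a minimal sketch. It assumes simple linear interpolation of sparse offline measurements onto the denser acquisition times of the Raman spectra. The abstract does not state which interpolation scheme was used, so the method, the function name `interpolate_labels`, and the example values are hypothetical and serve only to show how a small labeled set can be expanded.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def interpolate_labels(
    offline_t: Sequence[float],
    offline_y: Sequence[float],
    spectra_t: Sequence[float],
) -> List[Optional[float]]:
    """Assign a label to each spectrum by linear interpolation in time.

    Assumption (not from the source): labels vary linearly between two
    consecutive offline samples. Spectra recorded outside the sampled time
    window get None instead of an extrapolated value.
    """
    if len(offline_t) != len(offline_y) or len(offline_t) < 2:
        raise ValueError("need at least two (time, value) offline samples")

    # Sort the offline samples by time so that bisect can be used.
    pairs = sorted(zip(offline_t, offline_y))
    ts = [float(t) for t, _ in pairs]
    ys = [float(y) for _, y in pairs]
    if any(b <= a for a, b in zip(ts, ts[1:])):
        raise ValueError("offline sample times must be distinct")

    labels: List[Optional[float]] = []
    for t in spectra_t:
        if t < ts[0] or t > ts[-1]:
            labels.append(None)  # no extrapolation beyond the sampled window
            continue
        # Index of the right-hand bracketing sample, clamped so that
        # t == ts[-1] still falls into the last interval.
        hi = min(bisect_right(ts, t), len(ts) - 1)
        lo = hi - 1
        frac = (t - ts[lo]) / (ts[hi] - ts[lo])
        labels.append(ys[lo] + frac * (ys[hi] - ys[lo]))
    return labels


# Hypothetical example: 3 offline glucose values (g/L) at 0, 2, and 4 h,
# and spectra acquired every 0.5 h from 0 to 4.5 h.
spectra_times = [0.5 * i for i in range(10)]
glucose_labels = interpolate_labels([0, 2, 4], [10, 6, 1], spectra_times)
labeled = [(t, y) for t, y in zip(spectra_times, glucose_labels) if y is not None]
```

In this toy case, three offline measurements yield nine labeled spectra. In a semi-supervised setting, interpolated labels of this kind would be treated as provisional and refined by model predictions rather than taken as ground truth.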