Image perturbation is a promising technique for assessing radiomic feature repeatability without test-retest imaging. However, whether it improves model reliability to the same extent as test-retest imaging is unknown. This study aimed to compare the reliability of radiomic models built on repeatable features identified by image perturbation and by test-retest imaging. A public breast cancer dataset of 191 patients with 71 test-retest scans was used, with a predetermined split of 117 training and 74 testing samples. We collected apparent diffusion coefficient images and manually segmented tumor structures for radiomic feature extraction, and pathological complete response records as the prediction target. Random translations, rotations, and contour randomizations were applied to the training images, and the intra-class correlation coefficient (ICC) was used to quantify feature repeatability. After removing volume-correlated features, multiple ICC thresholds were applied to filter for repeatable features, and separate logistic regression models were developed using the five most relevant and independent features. We evaluated model reliability in terms of both generalizability and robustness, quantified by training and testing area under the receiver operating characteristic curve (AUC) and by prediction ICC under perturbation and test-retest, respectively. Testing performance was higher at higher ICC thresholds, but it dropped significantly at ICC = 0.95 for the test-retest model. Both approaches achieved similar optimal reliability at an ICC threshold of 0.9, with testing AUC = 0.76-0.77 and prediction ICC > 0.9. We recommend including feature repeatability analysis using image perturbation in any radiomic study when test-retest imaging is not feasible, but care should be taken when deciding the optimal feature repeatability criteria.
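
As an illustration of the ICC-based repeatability filtering described above, the following is a minimal sketch and not the study's implementation. It assumes a one-way random-effects ICC(1,1), a form the text does not specify, and runs on synthetic data. The function names `icc_oneway` and `repeatable_features`, the number of perturbations (10), and the noise levels are hypothetical; only the 117 training samples and the 0.9 threshold come from the text.

```python
import numpy as np


def icc_oneway(x):
    """One-way random-effects ICC(1,1) for an (n_subjects, k_repeats) array.

    ASSUMPTION: the ICC form is not stated in the text; ICC(1,1) is used
    here purely for illustration.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    subject_means = x.mean(axis=1)
    grand_mean = x.mean()
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def repeatable_features(values, threshold=0.9):
    """Return indices of features whose ICC meets the threshold, plus all ICCs.

    values: array of shape (n_features, n_subjects, k_repeats).
    """
    iccs = np.array([icc_oneway(feature) for feature in values])
    return np.flatnonzero(iccs >= threshold), iccs


# Synthetic stand-in for feature values extracted from perturbed images.
rng = np.random.default_rng(0)
n_patients, n_perturbations = 117, 10  # 10 perturbations is hypothetical
signal = rng.normal(size=(n_patients, 1))
stable = signal + 0.05 * rng.normal(size=(n_patients, n_perturbations))
unstable = signal + 2.0 * rng.normal(size=(n_patients, n_perturbations))

kept, iccs = repeatable_features(np.stack([stable, unstable]), threshold=0.9)
print("kept feature indices:", kept.tolist())
print("ICC per feature:", np.round(iccs, 3).tolist())
```

In this sketch the low-noise feature passes the 0.9 threshold and the high-noise feature is filtered out. Raising the threshold retains fewer features, which is why the choice of criterion can change the resulting model.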