pEC3 = -0.425 (±0.124)*clogP - 0.152 (±0.0412)*ETS2 + 6.17 (±1.22) (Equation 2)
n=21, r2=0.71, r2adj=0.46, p=0.002
As a final exercise, the dataset of 22 compounds was partitioned into a training set (N=14) and a test set (N=8). As in the work of Roberts et al., we generated the QSAR model on a set consisting primarily of nonfunctional aldehydes and ketones to avoid confounding effects. A small number of additional exemplars that help cover the full range in pEC3 were also included (Table 2). Again, a two-parameter model was fitted to the training data, giving equation 3. All compounds were then predicted using equation 3. Compounds 2, 4 and 18 were also predicted using our previously reported model for the SNAr domain.[54] The explained variance for the training set (adjusted r2=0.40) is somewhat lower than that observed with the larger combined set; however, the descriptor coefficients are qualitatively similar. Predictions for the test set show that the compounds are quite well ranked (r2=0.49). A noticeable outlier in figure 3 is compound 17, which, on further analysis of its structure, could also react via the acyl reaction domain.[5] This could account for its low predicted activity from this Schiff-base-derived model. When compounds 20 (SNAr domain) and 17 (acyl domain) are excluded from the test set, an r2 of 0.62 is observed.
pEC3 = -0.388 (±0.150)*clogP - 0.172 (±0.061)*ETS2 + 6.671 (±1.841) (Equation 3)
Training set (n=14, r2=0.49, r2adj=0.40, p=0.02)
Test set (n=8, r2=0.49); test set excluding compounds 17 and 20 (n=6, r2=0.62)
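As an illustration only, the short Python sketch below shows how equation 3 could be applied to a compound and how the adjusted r2 quoted for the training set follows from the reported r2, n and the number of descriptors. The function names and the descriptor values passed to `predict_pec3` are hypothetical and are not taken from the dataset; the coefficient uncertainties are ignored, and the adjusted r2 formula is the standard one for a linear model with an intercept, which we assume is what was used here.

```python
def predict_pec3(clogp: float, ets2: float) -> float:
    """Point prediction of pEC3 from Equation 3 (coefficient errors ignored)."""
    return -0.388 * clogp - 0.172 * ets2 + 6.671


def adjusted_r2(r2: float, n: int, n_descriptors: int) -> float:
    """Standard adjusted r2 for a linear model fitted with an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_descriptors - 1)


# Hypothetical descriptor values, chosen only to exercise the function;
# they do not correspond to any compound in Table 2.
example_pec3 = predict_pec3(clogp=2.0, ets2=10.0)

# Training-set statistics reported with Equation 3: n=14, r2=0.49,
# two descriptors (clogP and ETS2).
train_r2adj = adjusted_r2(0.49, n=14, n_descriptors=2)

print(f"example pEC3 = {example_pec3:.3f}")
print(f"training r2adj = {train_r2adj:.2f}")
```

With the reported training-set values (n=14, r2=0.49, two descriptors) the formula gives approximately 0.40, in agreement with the r2adj listed above.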