1.0 Introduction

Skin sensitization is a commonly observed occupational health issue that arises from an immunological allergic response. Skin sensitizers are chemical substances that elicit an allergic response after exposure to the skin, leading to allergic contact dermatitis (ACD).[1] It has been reported that 15-20% of the general population will suffer sensitization over the course of their lives. The disease is a significant regulatory health concern and has resulted in European Union legislation in the form of the Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) regulation. This legislation requires that the skin sensitization potential of all chemical substances manufactured or imported at a level of one ton or more per annum be assessed. A further goal of REACH is to increase the use of non-animal models for chemical assessment.[2, 3]
Skin sensitization arises from the reaction of a chemical sensitizer with skin proteins, triggering an immune response.[4, 5] A range of techniques has been developed to assess the skin sensitization potential of chemicals, including in vivo, in vitro, in chemico and in silico methods.[6-13] As a result of the ethical standards set by REACH legislation, there has been an increasing push away from in vivo models, such as the gold standard murine Local Lymph Node Assay (LLNA),[7] towards in vitro methods, such as the KeratinoSens™ assay,[14] and in chemico alternatives, such as peptide depletion assays.[13] Hoffman et al. analyzed 128 compounds with a range of sensitization endpoints and found that the LLNA shows ~75% concordance with human results, while that for the KeratinoSens™ assay was roughly comparable.[15] Natsch et al.[8] reported that the latter in vitro approach showed a 60% concordance with the LLNA for a set of 312 chemicals. Hoffman[16] and others[8, 17] report that the most effective strategy to predict skin sensitization potential is to employ multiple non-animal methods. The former reports that incorporating essentially orthogonal test strategies comprising in vitro, in chemico and in silico inputs gave the best overall performance, equivalent or superior to the LLNA on their curated set of 128 data points.[16]
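Concordance between two assays can be read as the fraction of compounds for which both give the same binary sensitizer/non-sensitizer call. A minimal Python sketch of that calculation is given here; the function name and the four toy calls are illustrative assumptions and are not data from the cited studies.

```python
def concordance(calls_a, calls_b):
    """Fraction of compounds on which two assays give the same binary call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both assays must cover the same compounds")
    if not calls_a:
        raise ValueError("at least one compound is required")
    # Count the compounds where the two calls match.
    agree = sum(a == b for a, b in zip(calls_a, calls_b))
    return agree / len(calls_a)

# Hypothetical calls for four compounds (True = sensitizer); NOT real data.
reference_calls = [True, True, False, False]
assay_calls = [True, False, False, False]
example_concordance = concordance(reference_calls, assay_calls)  # 3 of 4 agree
```

Note that a simple agreement fraction of this kind does not distinguish false positives from false negatives, which is one reason sensitivity and specificity are usually reported alongside it.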
In silico methods are desirable alternatives to in vivo models since a prediction for an unknown chemical can be made from its chemical structure alone. While this makes the methods generally cost, resource and time efficient, they are generally of lower accuracy than their experimental alternatives. In silico models range from similarity or substructural methods[5, 18, 19] that allow the identification of similar molecules with experimental data (read-across) to statistical models that relate 2D chemical descriptors to a qualitative or quantitative prediction of activity.[20-22] Methods such as TIMES, Toxtree and Derek Nexus[17, 23, 24] have proved useful in compound assessment in their own right[21, 25] and as part of multi-tiered testing strategies with in silico models as the first-tier approach.[21, 26]
A number of statistical models that relate chemical properties to the degree of sensitization have been reported in the literature. The guidelines that all in silico models must meet are: (a) a defined endpoint and (b) an unambiguous QSAR model that is (c) mechanistically interpretable. In addition, the model must have (d) predictivity that is fit for purpose and (e) a defined domain of applicability.[5, 27] Notable models include the relative alkylation index (RAI) of Roberts et al.[28] and models built on an individual chemical domain basis[12, 29-32] or on a global basis.[21, 22, 33, 34] Global models are generally desirable because of their greater applicability domain;[35] however, in many cases focusing a QSAR model on an individual chemical class (e.g. Schiff bases, Michael acceptors, SN1/SN2, SNAr, acyl)[12] gives better "local" performance.[36] These mechanistically interpretable models can offer greater confidence than black-box models, which may be important in a regulatory setting.
There has been a general trend towards more information-rich 3D-specific or quantum chemical descriptors in QSAR modelling studies of ligand bioactivity.[37-45] This includes the incorporation of dynamical effects via descriptors derived from molecular dynamics (MD) simulations,[46-48] interaction and conformational energies via quantum mechanics (QM),[49-53] and chemical reactivity.[54, 55] Indeed, this trend towards more information-rich descriptors has also been observed in studies of skin sensitization prediction, given that chemical reactivity can be encoded much more effectively with quantum chemically derived descriptors than with empirically derived ones. For example, Miller et al.[34] used semi-empirical HOMO-LUMO energies in their QSAR studies, Enoch et al. used density functional theory (DFT) energies of a key reaction intermediate as surrogates for the rate-determining barriers of Michael acceptors,[55] while Promkatkaew et al. fully profiled all intermediates and transition states in the reaction mechanism of SNAr chemicals.[54] Additional effort has been spent investigating ligand conformational effects on sensitization, given that molecules are not perfectly described by a single conformation. Yu et al. used 4D fingerprints in their studies,[33] while Kostal et al. incorporated Monte Carlo conformational sampling in their hybrid QSAR models with good results.[56]
In this work we apply quantum chemical methods to rationalize the sensitization potential of chemicals in the Schiff base (SB) domain (Scheme 1). Roberts et al. have previously reported a quantitative mechanistic model to predict the LLNA pEC3 using Taft σ* values and logP.[32] We were interested in expanding on this work by (a) employing a more diverse dataset covering a wider range of SB functional groups and (b) investigating whether QM-derived estimates of chemical reactivity could prove useful. In our previous work we showed that DFT-derived barrier estimates did indeed perform comparably well for the SNAr domain.[54] A key advantage of such methods is that predictions can be made for functional groups for which experimental Taft σ* values are not readily available. To this end we have collected 22 SB chemicals covering aliphatic and aromatic aldehydes and ketones, 1,2-diones and 1,3-diones, considerably expanding the domain of applicability of the model over the previous study (in which 11 of 16 compounds were aliphatic aldehydes). The full reaction energy profile leading to the formation of the 30 possible SB products for the 22 compounds, including 8 chemicals where more than one product is possible, was computed. The relationship between the barrier of the rate-determining step (RDS) and the LLNA pEC3 was then assessed. Finally, we constructed a two-parameter quantitative mechanistic model (QMM) using only the RDS barrier and the computed logP, the latter being another property identified as important for skin sensitization of SBs.[32]
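As an illustration of what a two-parameter model of this kind involves, the sketch below fits pEC3 = a·(RDS barrier) + b·logP + c by ordinary least squares using NumPy. The linear functional form is an assumption made here for illustration, as are the function and variable names. The input values are synthetic placeholders generated from arbitrary coefficients; they are not the data or the fitted model of this work.

```python
import numpy as np

def fit_two_parameter_model(barrier, logp, pec3):
    """Least-squares fit of pEC3 = a*barrier + b*logP + c (assumed linear form)."""
    barrier = np.asarray(barrier, dtype=float)
    logp = np.asarray(logp, dtype=float)
    pec3 = np.asarray(pec3, dtype=float)
    # Design matrix: one column per descriptor plus a constant intercept column.
    design = np.column_stack([barrier, logp, np.ones_like(barrier)])
    coeffs, _, _, _ = np.linalg.lstsq(design, pec3, rcond=None)
    # Coefficient of determination of the fit on the training data.
    predicted = design @ coeffs
    ss_res = np.sum((pec3 - predicted) ** 2)
    ss_tot = np.sum((pec3 - pec3.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return coeffs, r_squared

# Synthetic placeholder descriptors, NOT data from this study.
barrier_demo = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
logp_demo = np.array([1.0, 2.5, 0.5, 3.0, 1.5, 2.0])
# Response generated from arbitrary coefficients so the fit can be checked.
pec3_demo = -0.1 * barrier_demo + 0.3 * logp_demo + 2.0
coeffs_demo, r2_demo = fit_two_parameter_model(barrier_demo, logp_demo, pec3_demo)
```

Because the synthetic response is noise-free, the fit recovers the generating coefficients exactly; with real data the training-set R² should be complemented by cross-validation or an external test set before any claim about predictivity is made.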