The proliferation of computing applications on edge devices emphasizes the need for efficient and accurate Deep Learning (DL) models, especially in safety applications such as Driver Distraction Detection (DDD). However, the substantial computational requirements of DL hinder deployment in resource-constrained environments such as vehicles. This paper introduces a differentiable architecture search method for designing optimal, resource-conscious neural network architectures. We integrate edge-related constraints into a multi-objective function and investigate Pareto optimality to explore a diverse set of solutions that cover a spectrum of trade-offs between objectives, rather than a single, narrowly optimized solution. We specifically tailor the model design to a predetermined computational budget in terms of inference time and model size. The proposed method is evaluated for DDD on two benchmark datasets, SFD and AUCD2, and deployed on a range of devices (workstation, embedded system, and mobile devices). We obtain detection accuracies of 98.17% and 95.80% on SFD and AUCD2, respectively, while reducing model sizes to 0.25 MB and 0.36 MB and inference latency to 3 ms and 4 ms on the Nvidia Jetson Xavier NX. Additionally, our models have almost an order of magnitude fewer parameters (0.06M and 0.09M) than state-of-the-art models.
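To make the notion of Pareto optimality under a computational budget concrete, the following is a minimal sketch and not the search procedure of this paper: it filters a handful of already-evaluated candidate architectures down to the non-dominated ones and then keeps those that fit a latency and size budget. The `Candidate` type, the helper names, the candidate values, and the budget thresholds are all hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one evaluated architecture (all objectives: lower is better)."""
    name: str
    error: float        # 1 - accuracy
    latency_ms: float   # inference time on the target device
    size_mb: float      # serialized model size


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on at least one."""
    a_obj, b_obj = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_obj, b_obj))
    strictly_better = any(x < y for x, y in zip(a_obj, b_obj))
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands if o is not c)]


def within_budget(cands: List[Candidate], max_latency_ms: float, max_size_mb: float) -> List[Candidate]:
    """Keep the candidates that satisfy both the latency and the size budget."""
    return [c for c in cands if c.latency_ms <= max_latency_ms and c.size_mb <= max_size_mb]


# Illustrative values only; these are NOT results from the paper.
candidates = [
    Candidate("A", error=0.04, latency_ms=5.0, size_mb=0.5),
    Candidate("B", error=0.05, latency_ms=6.0, size_mb=0.8),   # worse than A on every objective
    Candidate("C", error=0.02, latency_ms=12.0, size_mb=2.0),  # most accurate, but slow and large
    Candidate("D", error=0.08, latency_ms=3.0, size_mb=0.2),   # fastest and smallest, least accurate
]

front = pareto_front(candidates)
feasible = within_budget(front, max_latency_ms=8.0, max_size_mb=1.0)
print([c.name for c in front])
print([c.name for c in feasible])
```

In this toy setting, the Pareto front retains several architectures that trade accuracy against latency and size, and the budget then selects the subset that is deployable on a given device. The sketch applies the filter after evaluation, whereas the method described above folds the constraints into the search objective itself.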