Background: Automatic facial landmark localization is an essential component of many computer vision applications, including video-based detection of neurological diseases. Machine learning models for facial landmark localization are typically trained on faces of healthy individuals, and we found that these models perform worse when applied to faces of people with neurological diseases. Fine-tuning pre-trained models with representative images significantly improves performance on clinical populations. However, questions remain about the characteristics of the database used to fine-tune the model and about the clinical impact of the improved model. Methods: We used the Toronto NeuroFace dataset, which consists of videos of healthy controls (HC), individuals post-stroke, and individuals with amyotrophic lateral sclerosis performing speech and non-speech tasks, with thousands of manually annotated frames, to fine-tune a well-known deep learning-based facial landmark localization model. The pre-trained and fine-tuned models were each used to extract landmark-based facial features from videos, and these features were used to discriminate clinical groups from HC. Results: Fine-tuning a facial landmark localization model with a diverse database that includes HC and individuals with neurological disorders significantly improved localization performance for all groups. Fine-tuning the model with representative data also greatly improved the ability of the subsequent classifier to distinguish clinical groups from HC in videos. Conclusions: Using a diverse database for model fine-tuning might result in better model performance for both HC and clinical groups. We demonstrated that fine-tuning a landmark localization model with representative data improves the detection of neurological diseases.