Counting parameters has become customary in the density functional theory community as a way to infer the transferability of popular approximations to the exchange–correlation functional. Recent work in data science, however, has demonstrated that the number of parameters of a fitted model is related neither to the complexity of the model itself nor to its eventual overfitting. Using similar arguments, we show here that it is possible to represent every modern exchange–correlation functional approximation using a single parameter. This procedure proves the futility of the number of parameters as a measure of transferability. To counteract this shortcoming, we introduce three statistical criteria for evaluating the transferability of exchange–correlation functionals and analyze their performance. The three criteria, the Akaike information criterion (AIC), the Vapnik–Chervonenkis criterion (VCC), and the cross-validation criterion (CVC), are used in a preliminary assessment to rank 60 exchange–correlation functional approximations using the ASCDB database of chemical data.
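The single-parameter argument can be illustrated with a toy sketch. The Python below is not the construction used in this work. It only shows, under the assumption that every parameter lies in [0, 1) and is kept to a fixed number of decimal digits, how several fitted parameters can be packed into one number and recovered from it. The names `pack`, `unpack`, and `DIGITS` are hypothetical, as are the example values.

```python
from fractions import Fraction

DIGITS = 6            # assumed precision kept per parameter (hypothetical choice)
SCALE = 10 ** DIGITS  # each parameter occupies one block of DIGITS decimal digits


def pack(params):
    """Pack parameters in [0, 1) into one exact rational number in [0, 1)."""
    single = Fraction(0)
    for i, p in enumerate(params, start=1):
        if not 0 <= p < 1:
            raise ValueError("this sketch assumes parameters in [0, 1)")
        # Quantize to DIGITS decimals; clamp so rounding never reaches SCALE.
        q = min(round(p * SCALE), SCALE - 1)
        # Place the i-th parameter in the i-th block of decimal digits.
        single += Fraction(q, SCALE ** i)
    return single


def unpack(single, n_params):
    """Recover n_params quantized parameters from the packed number."""
    params = []
    for _ in range(n_params):
        single *= SCALE
        q = int(single)  # leading block of digits
        single -= q      # keep the remaining blocks
        params.append(q / SCALE)
    return params


# Illustrative values only, not parameters of any real functional.
example_params = [0.25, 0.123456, 0.9]
example_single = pack(example_params)
example_recovered = unpack(example_single, len(example_params))
```

Here three "parameters" collapse into the single number `example_single` without losing any of the retained digits, which is why a bare parameter count says little about the flexibility of a model.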
The growing generation of data and their wide availability have led to the development of tools to produce, analyze, and store this information. Computational chemistry studies, especially those on catalytic applications, often yield a vast amount of chemical information that can be analyzed and stored using these tools. In this manuscript we present a framework that fully automates the transfer of an adsorbate from a known metal slab to a new metal slab with similar packing. Our method generates the new geometry, performs the required calculations and analysis, and finally uploads the processed data to an online database (ioChem-BD). Two implementations have been built: one to relocate minimum-energy structures and a second to transfer transition states. Our framework shows good performance for locating minima and decent performance for identifying transition states. Most of the failures occurred during the transition state searches, which needed additional steps to complete the process. Further improvements of our framework are required to increase the performance of both implementations. These results point to the _avoidhuman_ path as a feasible solution for studies on very large systems, which require a significant amount of human resources and are consequently prone to human error.
Quantum chemistry must evolve if it wants to fully leverage the benefits of the internet age, where the World Wide Web offers a vast tapestry of tools that enable users to communicate and interact with complex data at the speed and convenience of a button press. The Open Chemistry project has developed an open-source framework that offers an end-to-end solution for producing, sharing, and visualizing quantum chemical data interactively on the web using an array of modern tools and approaches. These tools build on some of the best open-source community projects, such as Jupyter for interactive online notebooks, coupled with 3D accelerated visualization, state-of-the-art computational chemistry codes including NWChem and Psi4, and emerging machine learning and data mining tools such as ChemML and ANI. They offer flexible formats to import and export data, along with approaches to compare computational and experimental data.
We have performed a large-scale evaluation of current computational methods, including conventional small-molecule force fields, semiempirical, density functional, ab initio electronic structure methods, and current machine learning (ML) techniques to evaluate relative single-point energies. Using up to 10 local minima geometries across ~700 molecules, each optimized by B3LYP-D3BJ with single-point DLPNO-CCSD(T) triple-zeta energies, we consider over 6,500 single points to compare the correlation between different methods for both relative energies and ordered rankings of minima. We find promise from current ML methods and recommend methods at each tier of the accuracy-time tradeoff, particularly the recent GFN2 semiempirical method, the B97-3c density functional approximation, and RI-MP2 for accurate conformer energies. The ANI family of ML methods shows promise, particularly the ANI-1ccx variant trained in part on coupled-cluster energies. Multiple methods suggest continued improvements should be expected in both performance and accuracy.
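The two comparisons described above, correlation of relative energies and agreement of ordered rankings, can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in this work. It assumes energies are taken relative to the lowest conformer within each method, measures ranking agreement with a Spearman coefficient computed as the Pearson correlation of ranks, and assumes there are no tied energies. The function names and the energy values are hypothetical.

```python
import numpy as np


def relative_energies(energies):
    """Shift conformer energies so the lowest one is zero (assumed convention)."""
    e = np.asarray(energies, dtype=float)
    return e - e.min()


def pearson_r2(x, y):
    """Squared Pearson correlation between two sets of relative energies."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2


def ranks(values):
    """Rank positions 0..n-1 in ascending order; assumes no ties."""
    return np.argsort(np.argsort(values))


def spearman_rho(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]


# Hypothetical conformer energies (kcal/mol) for one molecule.
reference = relative_energies([10.0, 10.8, 11.5, 12.9, 13.4])  # e.g. high-level
candidate = relative_energies([5.1, 6.2, 6.0, 8.1, 8.0])       # e.g. cheaper method

example_r2 = pearson_r2(reference, candidate)
example_rho = spearman_rho(reference, candidate)
```

A method can reproduce relative energies well on average (high `example_r2`) and still swap neighboring minima, which lowers `example_rho`. That is the reason for reporting both quantities.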