In this work we address the problem of rigorously evaluating the performance of an inertial navigation system (INS) under design when multiple alternative design choices are available. We introduce a framework based on Monte-Carlo simulations in which a standard extended Kalman filter, coupled with realistic and user-configurable noise generation mechanisms, attempts to recover a reference trajectory from noisy measurements. Several statistical metrics of the solution, aggregated over hundreds of realizations, give a reasonable estimate of the expected performance of the system in real-world conditions and allow the user to choose between alternative setups. To show the generality of our approach, we consider an example application to the problem of stochastic calibration. Two competing stochastic modeling techniques, namely the widely popular Allan variance linear regression and the emerging generalised method of wavelet moments, are rigorously compared in terms of the framework-defined metrics and in multiple scenarios. We find that the latter provides substantial advantages and should be preferred, at least for certain classes of inertial sensors. Our framework makes it possible to address a wide range of problems related to the quantification of navigation system performance, such as the robustness of an INS with respect to outliers or other modeling imperfections. While real-world experiments are essential to assess the performance of new methods, they tend to be costly and typically cannot provide enough replicates to evaluate, for example, the correctness of the estimated uncertainty. Our method can therefore bridge the gap between these experiments and the purely statistical considerations found, for example, in the stochastic calibration literature.