Biased, incomplete models are often used to forecast the states of complex dynamical systems by mapping an estimate of a “true” initial state into model phase space, making a forecast, and then mapping the result back to the “true” space. Although advances have been made in reducing the errors associated with model initialization and model forecasts, we lack a general framework for discovering optimal mappings between the reference dynamical system and the model phase space. Here, we propose a data-driven approach to infer these maps. In experiments with the Lorenz-96 system, our approach consistently reduces errors when the imperfect model is constructed to produce significant model errors relative to a reference configuration. The optimal pre- and post-processing transforms leverage “shocks” and “drifts” in the imperfect model to make more skillful forecasts of the reference system. The implemented machine learning architecture, which uses neural networks built with a custom analog-adjoint layer, makes the approach generalizable to numerous applications.
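The map, forecast, map-back workflow described above can be illustrated with a minimal sketch. This is not the implementation used in this work: the function names, the forcing values (8.0 for the reference, 10.0 for the imperfect model), the use of a forcing mismatch as the source of model error, and the identity placeholder maps are all assumptions made for illustration. In the proposed approach, the `pre` and `post` maps would instead be learned from data.

```python
import numpy as np

def lorenz96_tendency(x, forcing):
    # Standard Lorenz-96 tendency with cyclic indices:
    # dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, forcing, dt):
    # One classical fourth-order Runge-Kutta step.
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(x, forcing, dt, n_steps):
    # Advance the state n_steps forward in time.
    for _ in range(n_steps):
        x = rk4_step(x, forcing, dt)
    return x

def forecast_with_maps(x_ref, pre, post, model_forcing, dt, n_steps):
    # 1) map the reference state into model phase space,
    # 2) forecast with the imperfect model,
    # 3) map the forecast back to the reference space.
    z0 = pre(x_ref)
    z1 = integrate(z0, model_forcing, dt, n_steps)
    return post(z1)

def identity(x):
    # Placeholder map (assumption); a learned transform would replace it.
    return x

# Hypothetical experiment settings, chosen only for illustration.
REF_FORCING, MODEL_FORCING, DT = 8.0, 10.0, 0.01

rng = np.random.default_rng(0)
x0 = REF_FORCING + 0.01 * rng.standard_normal(40)
x_ref0 = integrate(x0, REF_FORCING, DT, 500)   # spin-up of the reference system
truth = integrate(x_ref0, REF_FORCING, DT, 20)  # reference ("true") evolution
baseline = forecast_with_maps(x_ref0, identity, identity, MODEL_FORCING, DT, 20)
baseline_rmse = np.sqrt(np.mean((baseline - truth) ** 2))
```

With identity maps, `baseline_rmse` measures the error of the uncorrected imperfect model; any candidate pair of pre- and post-processing transforms can be evaluated by passing it to `forecast_with_maps` and comparing the resulting error against this baseline.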