Space weather indices are used to drive forecasts of thermosphere density, which directly affects objects in low-Earth orbit (LEO) through atmospheric drag force. A set of proxies and indices (drivers), F10.7, S10.7, M10.7, and Y10.7 are used as inputs by the JB2008 thermosphere density model. The United States Air Force (USAF) operational High Accuracy Satellite Drag Model (HASDM), relies on JB2008, and forecasts of solar drivers from a linear algorithm. We introduce methods using long-short term memory (LSTM) model ensembles to improve over the current prediction method as well as a previous univariate approach. We investigate the usage of principal component analysis (PCA) to enhance multivariate forecasting. A novel method, referred to as striped sampling, is created to produce statistically consistent machine learning data sets. We also investigate forecasting performance and uncertainty estimation by varying the training loss function and by investigating novel weighting methods. Results show that stacked neural network model ensembles make multivariate driver forecasts which outperform the operational linear method. When using MV-MLE (multivariate multi-lookback ensemble), we see an improvement of RMSE for F10.7, S10.7, M10.7, and Y10.7 of 17.7%, 12.3%, 13.8%, 13.7% respectively, over the operational method. We provide the first probabilistic forecasting method for S10.7, M10.7, and Y10.7 . Ensemble approaches are leveraged to provide a distribution of predicted values, allowing an investigation into robustness and reliability (R&R) of uncertainty estimates. Uncertainty was also investigated through the use of calibration error score (CES), with the approach providing an average CES of 5.63%, across all drivers.