Objective: Exercise monitoring with low-cost wearables could improve the efficacy of remote physical-therapy prescriptions by tracking compliance and informing the delivery of tailored feedback. Although many commercial wearables can detect activities of daily life, such as walking and running, they cannot accurately detect physical-therapy exercises. The goal of this study was to build open-source classifiers for remote physical-therapy monitoring and to provide insight into how data-collection choices may affect classifier performance. Methods: We trained and evaluated multi-class classifiers using data from 19 healthy adults who performed 37 exercises while wearing 10 inertial measurement units (IMUs) on the wrist, pelvis, thighs, shanks, and feet. We investigated the effect of sensor density, location, type, sampling frequency, output granularity, feature engineering, and training-data size on exercise-classification performance. Results: Exercise groups (n=10) could be classified with 96% accuracy using a set of 10 IMUs and with 89% accuracy using a single pelvis-worn IMU. Combining multiple sensor modalities (i.e., accelerometers and gyroscopes), using high sampling frequencies, and adding more data from the same population did not improve model performance; data from diverse populations and better feature engineering could improve it in future work. Conclusions: Given the growing demand for exercise-monitoring systems, our sensitivity analyses, along with open-source tools and data, should reduce barriers for product developers, who must balance accuracy against product form factor, and increase transparency and trust among clinicians and patients.