Forecasting the impacts of a changing climate on the phenology of plant populations is essential for anticipating and managing potential ecological disruptions to biotic communities. Herbarium specimens enable assessments of plant phenology across broad spatiotemporal scales. However, specimens are collected opportunistically, and it is unclear whether their collection dates, which are used as proxies of phenology, are closest to the onset, peak, or termination of a phenophase, or whether sampled individuals represent early, average, or late occurrences in their populations. Yet no studies have assessed whether these uncertainties limit the utility of herbarium specimens for estimating the onset and termination of a phenophase. Using simulated data mimicking such uncertainties, we evaluated the accuracy with which the onset and termination of population-level phenological displays (in this case, of flowering) can be predicted from natural-history collections data (in the absence of other biases not evaluated here), and how attributes of a species’ flowering period and temporal collection biases influence model accuracy. Estimates of population-level onset and termination were highly accurate across a wide range of simulated species’ attributes, but accuracy declined among species with longer individual-level flowering duration and when there were temporal biases in sample collection, as is common among the earliest- and latest-flowering species. The amount of data required to model population-level phenological displays is not impractical to obtain; model accuracy declined by less than 1 day as sample sizes fell from 1000 to 300 specimens.
Our analyses of simulated data indicate that, in the absence of pervasive collection biases and provided that the climate conditions affecting phenological timing are correctly identified, specimen data can predict the onset, termination, and duration of a population’s flowering period with accuracy similar to that of the estimates of median flowering time that are commonplace in the literature.