Generative models for multivariate time series mainly focus on generating time series from complete data. Few works generate complete multivariate time series from incomplete data, where missing features and time segments make it challenging to capture feature correlation and temporal dependence simultaneously. Missing values disrupt the feature correlation and temporal dependence learned by existing models, which degrades the learned distribution of the missing values and results in low utility of the synthetic data. Moreover, existing methods capture feature correlation and temporal dependence separately, which causes the joint distribution over the feature and time dimensions to deviate and thus reduces the fidelity of the synthetic data. In this work, we present TimeDiff, a conditional diffusion model that generates time series with high fidelity and utility. The method randomly masks part of the observed data to simulate missing values and uses the result as the condition for self-supervised learning of the missing-value distribution. Meanwhile, to unite the feature and time dimensions, we propose a multi-dimension temporal-feature network that captures the correlation of temporal variation among features. Experiments on real-world time series datasets demonstrate that TimeDiff achieves state-of-the-art results for generating high-utility multivariate time series from incomplete data with missing rates of up to 80%.
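To make the masking idea concrete, the following is a minimal sketch of how observed entries might be split into a conditioning part and a held-out part that serves as the self-supervised target. It is an illustration under stated assumptions, not the TimeDiff implementation: the function name `split_observed`, the parameter `mask_ratio`, the independent per-entry masking strategy, and the zero-filling of hidden entries are all hypothetical choices made for this example.

```python
import numpy as np


def split_observed(observed_mask, mask_ratio, rng):
    """Randomly hide a fraction of the observed entries.

    observed_mask: bool array of shape (time, features), True where a value
        was actually recorded.
    mask_ratio: probability of hiding each observed entry (assumed parameter).
    Returns (cond_mask, target_mask): entries kept as the condition, and
    entries hidden to act as stand-ins for missing values.
    """
    observed_mask = np.asarray(observed_mask, dtype=bool)
    # One uniform draw per entry; draws below mask_ratio mark entries to hide.
    hide = rng.random(observed_mask.shape) < mask_ratio
    target_mask = observed_mask & hide    # observed, but hidden from the model
    cond_mask = observed_mask & ~hide     # observed and shown to the model
    return cond_mask, target_mask


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))                # toy series: 6 time steps, 3 features
observed_mask = rng.random(x.shape) > 0.3  # toy pattern of truly missing entries
cond_mask, target_mask = split_observed(observed_mask, mask_ratio=0.5, rng=rng)

# Conditioning input: hidden and truly missing entries are zero-filled (assumption).
x_cond = np.where(cond_mask, x, 0.0)
```

In this sketch a training loss would be evaluated only where `target_mask` is true, since those are the only hidden positions with known ground truth. Truly missing entries fall in neither mask and contribute nothing to the loss.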