This paper proposes PixelPrune, an approach that addresses two primary challenges in artificial intelligence of things (AIoT) vision systems: (1) the energy-intensive analog-to-digital converters (ADCs) required in the sensing unit to convert analog pixel arrays into digital tensors, and (2) the high volume of data transferred between the sensing unit and the computing unit. Our solution implements an in-sensor binary segmentation model on analog memristive crossbars to identify the important pixels and prune away background information. In addition, we propose a data transfer scheme that adaptively selects between dense and sparse transfer formats based on the sparsity ratio measured from the segmentation mask produced by the segmentation model. Our results demonstrate that the proposed object detection system achieves significant energy savings and reduces data transfer by up to 95%, all while maintaining high accuracy.
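The adaptive format selection can be sketched as follows; this is a minimal illustration, and the function name, bit widths, and (index, value) sparse encoding are assumptions for the sketch, not details taken from the paper:

```python
import numpy as np

def choose_transfer_format(mask, pixel_bits=8, index_bits=16):
    """Pick dense vs. sparse transfer from a binary segmentation mask.

    Dense: transmit every pixel value.
    Sparse: transmit an (index, value) pair per foreground pixel only.
    Bit widths are illustrative assumptions.
    """
    total = mask.size
    kept = int(mask.sum())           # foreground pixels flagged by the mask
    sparsity = 1.0 - kept / total    # fraction of pixels pruned as background

    dense_bits = total * pixel_bits
    sparse_bits = kept * (index_bits + pixel_bits)
    fmt = "sparse" if sparse_bits < dense_bits else "dense"
    return fmt, sparsity, min(dense_bits, sparse_bits) / dense_bits

# A small foreground object in a 32x32 frame: the mask is highly
# sparse, so the sparse format wins by a wide margin.
mask = np.zeros((32, 32), dtype=np.uint8)
mask[8:12, 8:12] = 1
fmt, sparsity, rel_cost = choose_transfer_format(mask)
```

With these assumed bit widths, the sparse format is cheaper whenever fewer than a third of the pixels are foreground, which matches the intuition that background-dominated frames benefit most from pruning.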