The VLSI architectures for stack or priority queue (PQ) are required in the implementation of stack or sequential decoders of polar codes. Such type of decoders provide good BER performance keeping complexity low. Extracting the best and the worst paths from PQ is the most complex operation in terms of both latency and complexity, because this operation requires full search along priority queue. In this work we propose a low latency and low complexity parallel hardware architecture for PQ, which is based on the systolic sorter and simplified sorting primitives. The simulation results show that just small BER degradation is introduced compared to ideal full sorting networks. Proposed PQ architecture is implemented in FPGA, the synthesis results are presented for all components of PQ.