A parallel shallow water flow model is introduced in this paper. An explicit time-stepping finite volume approach is adopted to solve the 2D shallow water equations on an unstructured triangular mesh. The scheme is second-order accurate in time and space, using a two-stage Runge-Kutta method and the Monotone Upstream-centered Schemes for Conservation Laws (MUSCL) reconstruction, respectively. A multi-GPU version is built on the Message Passing Interface (MPI) and OpenACC, with the METIS library performing the domain decomposition. To speed up MPI data exchange and improve overall performance, a CUDA-aware MPI library with GPUDirect peer-to-peer (P2P) transfers between two GPUs is used, and computation is overlapped with MPI communication. A 2D circular dam break test with wet and dry downstream beds, at grid resolutions of about 2 million cells, is used to verify the accuracy of the code, and the results agree well with numerical simulations from published studies. Compared with the multi-CPU version running on a 6-core CPU, maximum speedups of 56.18 and 331.51 were obtained with the single-GPU and multi-GPU versions, respectively. The results indicate that acceleration performance improves as the mesh resolution increases.
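
To illustrate how a two-stage Runge-Kutta update combines with a MUSCL reconstruction, the following is a minimal sketch only, not the solver described in this paper. It assumes a 1D linear advection equation on a uniform periodic grid instead of the 2D shallow water equations on triangles, a minmod slope limiter (the limiter is not specified here), and the Heun form of the two-stage scheme. All function names (`minmod`, `residual`, `rk2_step`, `advect`) are hypothetical.

```python
def minmod(a, b):
    """Return the smaller-magnitude argument if the signs agree, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def residual(u, speed, dx):
    """Spatial operator L(u) for u_t + speed * u_x = 0 with speed > 0 (assumed)."""
    n = len(u)
    # Limited cell-wise difference; u[i - 1] wraps around for i = 0 (periodic).
    slope = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    # Upwind flux at the right face of cell i, using the MUSCL left state.
    flux = [speed * (u[i] + 0.5 * slope[i]) for i in range(n)]
    # Finite volume balance: flux out of the right face minus flux in at the left.
    return [-(flux[i] - flux[i - 1]) / dx for i in range(n)]


def rk2_step(u, speed, dx, dt):
    """One two-stage Runge-Kutta (Heun) step: a predictor and an averaged corrector."""
    k1 = residual(u, speed, dx)
    u1 = [ui + dt * ki for ui, ki in zip(u, k1)]
    k2 = residual(u1, speed, dx)
    return [0.5 * (ui + u1i + dt * k2i) for ui, u1i, k2i in zip(u, u1, k2)]


def advect(u0, speed=1.0, dx=0.01, cfl=0.4, steps=100):
    """Advance the initial cell averages u0 by a fixed number of steps."""
    dt = cfl * dx / speed  # explicit schemes need a CFL-limited time step
    u = list(u0)
    for _ in range(steps):
        u = rk2_step(u, speed, dx, dt)
    return u


if __name__ == "__main__":
    # Square pulse on a periodic grid of 100 cells.
    u0 = [1.0 if 20 <= i < 40 else 0.0 for i in range(100)]
    u = advect(u0, steps=50)
    print("mass before:", sum(u0), "mass after:", sum(u))
```

In this sketch the limiter keeps the reconstructed face values from creating new extrema, and the flux-difference form conserves the total of the cell averages, which are the same properties a finite volume shallow water solver relies on near wet and dry fronts.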