Recent advances in Earth observation data and computing capacity create exciting opportunities for national and global studies of human impacts on water resources. However, because databases of artificial levees remain incomplete, there is still a need to better understand how artificial levees affect floodplain extent at regional and larger scales. Here, we estimate river-floodplain disconnection in the contiguous United States (CONUS) using an incomplete artificial levee database, machine learning algorithms, and hydrogeomorphic floodplain delineation models. We tested different topographic, land use, and spatial variables with different machine learning techniques in a case study of seven geographically diverse HUC8 basins before applying the approach at the national scale. We found that a parsimonious random forest model without topographic variables was 97% accurate. When applied to areas within a national 100-year hydrogeomorphic floodplain, the model indicated the potential for more than 180,000 km of undocumented artificial levees, implying that the National Levee Database (NLD) is about 20% complete. More than 62% of potential levees are concentrated in the Upper and Lower Mississippi and Missouri basins. The stream order distributions of potential and NLD levees are similar; however, potential levees are primarily located along stream orders 3 and 6, while NLD levees are primarily located along stream orders 2, 3, and 4. Using the known and potential levee locations, we explored the national impacts of artificial levees on floodplain extent by comparing two hydrogeomorphic floodplains based on (1) an unmodified USGS 1 arc-second DEM and (2) a modified DEM with known and potential levees erased from the topography. We found that the overall effect of artificial levee removal was to shift the location of flooding. Over 30% of the CONUS 100-year floodplain was in cultivated or developed land use.