
I'm trying to subtract a very large set of features (water areas) from another, relatively small set of features (land area).

In principle, I should be able to do this in QGIS using Vector > Geoprocessing Tools > Difference, but because the set of water features is very large (a 1.7GB shapefile), this is taking... a while. (Given that the progress bar has not budged in 14 hours when running on just a small subset of it, I am reasonably convinced it has crashed.)

What's the best way of doing this? Can I do it with GDAL/OGR or a Python library? Will it be more likely to work if I merge all the separate water features into a single feature? (Currently there are about 2.2 million, and I don't actually care about the water features, only the land.)

Joseph
futuraprime
  • It will likely work better with a single feature, because this tool loops over every combination of features from the two files. I'm not entirely sure, as it might loop over the feature parts in the same way, but it's worth a try. The other features are not that different, so overall it will be a long process. – Matte Apr 14 '16 at 14:07

1 Answer


I would suggest dissolving (Vector > Geoprocessing Tools > Dissolve) your water area shapefile to a single feature (as mentioned by @Matte) before running the Difference tool.

You could do this inside QGIS, but since it's a large file, I would suggest using ogr2ogr in the OSGeo4W Shell. Example:

ogr2ogr -f "ESRI Shapefile" dissolved.shp input.shp -dialect sqlite -sql "select ST_union(Geometry),common_attribute from input GROUP BY common_attribute"
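If the water layer has no shared attribute to group by, the GROUP BY clause can be dropped so that everything is unioned into one feature. Below is a minimal sketch in Python that only assembles the ogr2ogr argument list for either case. The helper name `build_dissolve_cmd` and the file names are hypothetical, and it assumes a GDAL build with SpatiaLite support, which `ST_Union` in the SQLite dialect relies on.

```python
import shlex
from pathlib import Path


def build_dissolve_cmd(src, dst, group_by=None, progress=False):
    """Build (but do not run) an ogr2ogr call that unions a shapefile's geometries.

    Hypothetical helper. The layer name is assumed to be the file name
    without its extension, which is how OGR names shapefile layers.
    """
    layer = Path(src).stem
    if group_by:
        # One output feature per distinct value of the grouping attribute.
        sql = (f"SELECT ST_Union(geometry), {group_by} "
               f"FROM {layer} GROUP BY {group_by}")
    else:
        # No grouping: every geometry is merged into a single feature.
        sql = f"SELECT ST_Union(geometry) FROM {layer}"

    cmd = ["ogr2ogr"]
    if progress:
        cmd.append("-progress")  # ask ogr2ogr to print a progress indicator
    cmd += ["-f", "ESRI Shapefile", dst, src,
            "-dialect", "sqlite", "-sql", sql]
    return cmd


# Print the command so it can be checked before running it in the shell.
print(shlex.join(build_dissolve_cmd("water.shp", "dissolved.shp", progress=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the printed command looks right. Building the list first avoids shell-quoting problems with the SQL string.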

Then with the output, you can run the Difference tool.
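As an alternative to the QGIS Difference tool, the subtraction itself might also be expressed through ogr2ogr. The sketch below is untested against this data and rests on several assumptions: GDAL is built with SpatiaLite (for `ST_Difference`), the SQLite dialect's `'file'.layer` syntax is used to reach the second shapefile, the dissolved water layer holds a single feature, and the file names and the helper `build_difference_cmd` are hypothetical. It keeps only the geometry, so land attributes would be dropped.

```python
import shlex
from pathlib import Path


def build_difference_cmd(land, water, dst):
    """Build (but do not run) an ogr2ogr call for land minus dissolved water.

    Hypothetical helper. Layer names are assumed to match the shapefile
    names without extension.
    """
    land_layer = Path(land).stem
    water_layer = Path(water).stem
    # Cross join: with one water feature, this yields one row per land feature.
    sql = ("SELECT ST_Difference(l.geometry, w.geometry) AS geometry "
           f"FROM {land_layer} l, '{water}'.{water_layer} w")
    return ["ogr2ogr", "-f", "ESRI Shapefile", dst, land,
            "-dialect", "sqlite", "-sql", sql]


# Print the command for inspection rather than executing it.
print(shlex.join(build_difference_cmd("land.shp", "dissolved.shp", "land_only.shp")))
```

If the dissolved layer contains several features (for example after a GROUP BY), the cross join would produce one output row per land and water pair, which is probably not what is wanted here.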

Joseph
  • I'm running this now, but it's also taking a very long time (not especially surprising). Is there a way to run it with a progress bar or at least a verbose mode so I could see that it's still doing something? (I didn't see anything obvious in the ogr2ogr docs.) – futuraprime Apr 15 '16 at 09:45
  • @futuraprime - Not surprising at all =). Try adding -progress after calling ogr2ogr (e.g. ogr2ogr -progress -f "ESRI Shapefile" ...), as mentioned in this post. – Joseph Apr 15 '16 at 09:53