I'm trying to load a large set of polygons into GRASS with "v.in.ogr" and am running out of memory at the "Break Boundaries" stage. Can I split into separate steps everything that v.in.ogr does to a data set when it is run without the "-c" flag? For example, I know that it must run the equivalent of "v.clean tool=bpol" and "v.clean tool=rmdupl", but I'm not sure what to run after those two to complete the polygon cleaning that normally happens. I figure I might save some memory if I don't have to run all the operations in one go.

Does this make sense?

Thanks.


Ok, so it was like digging for gold to find this, but in a presentation given by some people who seem to know GRASS pretty well, they said that the following steps were needed to clean polygon geometry after import:

Cleaning procedures for area import

  • break polygons
  • remove duplicates
  • break boundaries
  • remove duplicates
  • clean boundaries at nodes
  • change dangles to lines
  • remove bridges

Link to presentation here: http://geostat-course.org/system/files/GRASS_geostat_landau_2011_intro.pdf


Wait, no... there's more to this apparently. I tried those steps listed above (or attempted to at least). I think I'm worse off now than when I started.

v.in.ogr -c
Importing map 52 features...
Number of boundaries: 52
Number of centroids: 52
Number of areas: 42
Number of isles: 42
Number of incorrect boundaries: 10
Number of centroids outside area: 10

1. break polygons
  v.clean type=boundary tool=bpol
Number of boundaries: 270
Number of centroids: 52
Number of areas: 2
Number of isles: 4
Number of incorrect boundaries: 268
Number of centroids outside area: 50

2. remove duplicates
  v.clean type=boundary tool=rmdupl
Number of boundaries: 190
Number of centroids: 52
Number of areas: 40
Number of isles: 4
Number of incorrect boundaries: 126
Number of centroids outside area: 12

3. break boundaries
  v.clean type=boundary tool=break
Number of boundaries: 5147
Number of centroids: 52
Number of areas: 151
Number of isles: 0
Number of incorrect boundaries: 5132
Number of centroids outside area: 40
Number of areas without centroid: 139

4. remove duplicates
  v.clean type=boundary tool=rmdupl 
Number of boundaries: 4803
Number of centroids: 52
Number of areas: 153
Number of isles: 0
Number of incorrect boundaries: 4781
Number of centroids outside area: 38
Number of areas without centroid: 139

5. clean boundaries at nodes
  (No idea what-the-flip this means.  Clean with what?  Do it by hand?)

6. change dangles to lines
   v.clean type=boundary tool=chdangle thresh=-1
Number of boundaries: 4803
Number of centroids: 52
Number of areas: 153
Number of isles: 0
Number of incorrect boundaries: 4781
Number of centroids outside area: 38
Number of areas without centroid: 139

7. remove bridges
  v.clean type=boundary tool=rmbridge in=eq1_6 out=eq1_7
Number of boundaries: 4420
Number of centroids: 52
Number of areas: 153
Number of isles: 22
Number of incorrect boundaries: 4352
Number of centroids outside area: 38
Number of areas without centroid: 139

Guess I'm off to read the source for v.in.ogr now.

lagerratrobe

2 Answers

Answer:

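These are the steps v.in.ogr runs to clean polygons when the "-c" flag is not given. To do the cleaning yourself, run them in this order on a map that was imported with "-c" (which skips the cleaning):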

  • v.clean tool=snap type=boundary thresh=0.000001
  • v.clean tool=bpol type=boundary
  • v.clean tool=rmdupl type=boundary
  • v.clean tool=break type=boundary
  • v.clean tool=rmdupl type=boundary
  • v.clean tool=rmsa type=boundary
  • v.clean tool=chdangle thresh=-1.0 type=boundary
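
A minimal sketch of running the steps above one at a time, each writing its own output map. It assumes the map was already imported with "v.in.ogr -c". The function names, the "_stepN" map names and the DRY_RUN switch are made up for this example, and "thresh=" is spelled as in the list above (I believe newer GRASS versions spell it "threshold=").

```shell
#!/bin/sh
# Run the seven cleaning steps separately so that each v.clean call
# only has to do one job. Map names of the form <map>_stepN are
# hypothetical; adjust them to your own naming.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = name of the map imported with "v.in.ogr -c"
clean_in_steps() {
    in_map=$1
    step=0
    # One "tool[:threshold]" entry per step, in the order listed above.
    for spec in snap:0.000001 bpol rmdupl break rmdupl rmsa chdangle:-1.0; do
        step=$((step + 1))
        tool=${spec%%:*}
        out_map="${1}_step${step}"
        thresh_arg=
        if [ "$tool" != "$spec" ]; then
            thresh_arg="thresh=${spec#*:}"
        fi
        # $thresh_arg is left unquoted on purpose: it vanishes when empty.
        run v.clean input="$in_map" output="$out_map" \
            type=boundary tool="$tool" $thresh_arg || return 1
        in_map=$out_map
    done
}
```

With DRY_RUN=1 set, "clean_in_steps eq1" only prints the seven v.clean commands, which is a cheap way to check the chain before running it for real. The intermediate maps take up disk space, so they can be deleted once the following step has succeeded.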

This was as far as I needed to go in order to clean the topology of my data set. There are some additional steps that you may need if your data set needs further cleaning. I leave it to the reader to convert these into GRASS v.clean commands.

Vect_remove_dangles(&Tmp, GV_BOUNDARY, -1.0, NULL);
Vect_chtype_bridges(&Tmp, NULL);
Vect_remove_bridges(&Tmp, NULL);
Vect_merge_lines(&Tmp, GV_BOUNDARY, NULL, NULL);
Vect_build_partial(&Tmp, GV_BUILD_ATTACH_ISLES);
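
One possible translation of those library calls into commands, offered as a sketch rather than a verified equivalent. The mapping is my assumption: Vect_remove_dangles to tool=rmdangle, Vect_chtype_bridges to tool=chbridge, Vect_remove_bridges to tool=rmbridge, and v.build as the closest command-line counterpart to the final rebuild. I don't know of a v.clean tool that matches Vect_merge_lines, so it is left out. The function names and "_extraN" map names are made up.

```shell
#!/bin/sh
# Extra cleaning steps, again one v.clean call per step.
# The tool chosen for each library call is an assumption, not
# something taken from the v.in.ogr source.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = name of the map produced by the earlier cleaning steps
extra_clean_steps() {
    in_map=$1
    n=0
    # rmdangle  ~ Vect_remove_dangles(..., GV_BOUNDARY, -1.0, ...)
    # chbridge  ~ Vect_chtype_bridges
    # rmbridge  ~ Vect_remove_bridges
    # (Vect_merge_lines is skipped: no matching v.clean tool assumed)
    for spec in rmdangle:-1.0 chbridge rmbridge; do
        n=$((n + 1))
        tool=${spec%%:*}
        out_map="${1}_extra${n}"
        thresh_arg=
        if [ "$tool" != "$spec" ]; then
            thresh_arg="thresh=${spec#*:}"
        fi
        # $thresh_arg is left unquoted on purpose: it vanishes when empty.
        run v.clean input="$in_map" output="$out_map" \
            type=boundary tool="$tool" $thresh_arg || return 1
        in_map=$out_map
    done
    # Explicit full topology rebuild of the final map.
    run v.build map="$in_map"
}
```

Setting DRY_RUN=1 before calling "extra_clean_steps eq1_step7" prints the four commands without touching any data.
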
lagerratrobe

This doesn't answer your question directly, but an alternative to splitting the import/clean process is to add temporary swap space to your system. See the wiki page on GRASS memory issues.

simo