
I have concerns about whether ArcGIS 10 can meet one of my requirements.

I have tried ET Geowizards, but it doesn't have the same capability as ArcGIS; for example, with ET I cannot aggregate all of my points at the scale at which I have them plotted.

There is a memory leak. I loop through 700 objects and, for each one, perform:

  1. Agg Points.
  2. Buffer.
  3. Add Field.
  4. Update cursor.

The loop starts off taking 5-9 seconds per object and slows to 2 minutes per (similarly sized) object.

In SP2, it appears that AggPoints no longer works for creating a feature class on the fly. There is more, but the list is too long to compile!

Code, simplified, with no buffer, add field or cursor. geom is a collection of arcpy points:

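import time
import arcpy

def createGeom(geom, scratchDB):
    # Build a unique output name from the current timestamp.
    filetime = str(time.time()).split(".")
    outfile = "fc" + filetime[0] + filetime[1]
    # scratchDB is expected to end with a path separator.
    outpath = scratchDB + "tmpV.gdb/Polygon/"
    outFeatureAggClass = outpath + outfile + "_Agg"
    # Aggregate the points into polygons with a 124000 meter aggregation distance.
    arcpy.AggregatePoints_cartography(geom, outFeatureAggClass, "124000 meters")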

The size of the file geodatabase we're writing to seems to be one of the main issues: I think performance degrades significantly as the local file geodatabase fills up.
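A rough way to check that hypothesis is to log, for each object, how long the step took and how large the geodatabase folder was afterwards. The sketch below is only an illustration: `folder_size_bytes` and `profile_runs` are hypothetical helper names, and it relies on a file geodatabase being an ordinary folder on disk. The processing step is passed in as a function, so nothing here calls arcpy directly.

```python
import os
import time


def folder_size_bytes(path):
    """Sum the sizes of all files under path (a .gdb is a folder on disk)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Lock files can vanish between listing and stat; skip them.
                pass
    return total


def profile_runs(step, items, gdb_path):
    """Run step(item) for each item; return one (seconds, gdb_bytes) pair per item."""
    results = []
    for item in items:
        start = time.time()
        step(item)
        elapsed = time.time() - start
        results.append((elapsed, folder_size_bytes(gdb_path)))
    return results
```

It could be called as `profile_runs(lambda g: createGeom(g, scratchDB), geoms, scratchDB + "tmpV.gdb")`, where `geoms` is an assumed list of point collections. If the times climb in step with the folder size, that would support the geodatabase-size explanation; if they climb while the size stays roughly flat, the cause is more likely elsewhere.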

Any ideas on how to tune a local file geodatabase?

Hairy
  • can you be more specific or give some examples highlighting your concerns with ArcGIS 10? – artwork21 Aug 01 '11 at 13:00
  • Comment added. I would love to do a concave hull, but the only code kicking around isn't suitable for the grid of points I am using, as it's simply too large. I am using world grids of around 60km each. – Hairy Aug 01 '11 at 13:47
  • downvote because the Q is too broad. I think people are upvoting on the title alone. It's a theme many are interested in (myself included!), but it doesn't make a good question. – matt wilkie Aug 02 '11 at 16:50
  • In terms of cursors, I have found ArcGIS to be a nightmare when working in Python. C# has been much more amenable to the task of large cursor processes, though. – Nathanus Aug 02 '11 at 17:55
  • The cursor is small, and I am using it to update one object. I have also removed the cursor to show it isn't that, which it isn't. Calling arcpy causes the leaks. – Hairy Aug 03 '11 at 06:58
  • downvote rescinded, with the new edit it's much easier to see what is being attempted. +1 for that. – matt wilkie Aug 03 '11 at 16:36
  • Let me see if I have this straight now: you have a python script processing a list of 700 point feature classes, sending each one through createGeom, and for the first few fc's it takes seconds to process each one, but it gets progressively slower? – matt wilkie Aug 03 '11 at 17:05
  • in line with the previous idea, I'm feeding a point fc of ~3900 records through createGeom repeatedly (for i in range(1,700): createGeom(geom,scratchDB)). In the 100 iterations so far, each one takes 18-19 secs. So it looks like, on my machine anyway, the leak is not in the call to arcpy.AggregatePoints. – matt wilkie Aug 03 '11 at 18:01
  • here's the test script: http://pastebin.com/5tzappDn – matt wilkie Aug 03 '11 at 18:10
  • I recommend changing the title of this to something like "memory leak in ArcGIS aggregate points/buffer/cursor?", and put "Open source alternative to ArcGIS aggregate points & buffer?" in a separate question. – matt wilkie Aug 03 '11 at 18:17
  • Matt, what version of ArcGIS are you on, and what platform? I am on XP SP3, ArcGIS 10 SP1 (incidentally, Agg Points doesn't work in SP2). – Hairy Aug 04 '11 at 07:00
  • Matt, what I have is a collection of points. There could be as few as 4 points, but there could be thousands. In the main they are similar in size: those 'with no data' have 4 points, while those 'with data' have N points. – Hairy Aug 04 '11 at 07:02
  • ArcGIS 10 SP1, Win7 Pro x64. If AggPoints has stopped working I'd file a bug! I let the test script run 250+ iterations before calling it quits; there was never more than a 1 sec variation in processing time. – matt wilkie Aug 05 '11 at 03:27
  • No worries, I tested it on my Win 7 x64 machine and it did leak. Slightly concerning, however, is that if I call gc.enable() and then gc.set_debug(gc.DEBUG_LEAK), it speeds up dramatically but still leaks. I think this is down to the COM objects holding onto resources, I really do. I have filed a bug under SP2, but it is still in 10.1 too. Happy days hey! – Hairy Aug 05 '11 at 05:53
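
One rough way to tell a Python-level leak from one in the underlying libraries is to count the objects the garbage collector tracks after each iteration. The sketch below is an assumption-laden illustration, not a diagnosis: `tracked_object_counts` is a hypothetical helper, the processing step is passed in as a function, and `gc.get_objects()` only sees container objects tracked by the collector, so memory held outside Python's heap will not appear in the counts.

```python
import gc


def tracked_object_counts(step, repeats):
    """Call step() repeatedly; return the GC-tracked object count after each call."""
    counts = []
    for _ in range(repeats):
        step()
        gc.collect()  # free unreachable reference cycles before counting
        counts.append(len(gc.get_objects()))
    return counts
```

If the counts stay flat while the process memory shown by the operating system keeps climbing, the growth is probably not in Python objects, which would fit the suspicion about COM objects holding onto resources. Steadily rising counts would instead point at references being kept alive in the script itself.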

1 Answer


If you show the simplest possible form of the code, it might benefit from using a dictionary instead of a cursor, or an in-memory workspace, or a change to the workflow (for example aggregate & buffer then cursor, instead of cursor then aggregate & buffer [ref]), or... In any case, start here: Performance of ArcGISScripting and large spatial data sets

matt wilkie
  • Matt, I thought I had made the workflow clear? The cursor only ever updates one record, the record created by the agg points and buffer; I cannot update the data without first creating the dataset (which I do with the agg points and buffer) and adding the fields. I don't find this useful at all Matt, to be honest. – Hairy Aug 03 '11 at 06:56
  • Matt's suggestions are sound. When you say there is a memory leak, how much memory is it actually using? If you are using an in-memory workspace, take care to either overwrite or delete anything you create there when it is no longer needed so that you don't fill up your RAM and start hitting the pagefile instead. We would need to see how you are calling your createGeom in a loop to say whether/how that could be improved. – blah238 Aug 03 '11 at 07:55
  • Also are you hitting an SDE feature class in this script? I ran into an actual memory leak in a very specific situation involving a direct connection to an SDE feature class which I described here. It's possible, though unlikely, that the same leak is happening here. – blah238 Aug 03 '11 at 08:01
  • I think you can see that no in-memory workspace is being used. It's using a set of points to create a feature class. They are all being written to a local FGDB. I said Matt wasn't being useful because I had described the workflow clearly before he wrote his post. He also got the order of the cursor wrong. – Hairy Aug 03 '11 at 12:27
  • @Hairy apologies for lack of usefulness. It appears there's enough of that to flow in both directions, as what was described as a clear workflow for you was anything but for me. ;-) In any case, on with the troubleshooting! (in additional comments) – matt wilkie Aug 03 '11 at 17:54
  • No worries Matt, I was perhaps a little harsh, but I couldn't see your reasoning given the pseudo code I had put up. Thanks for your comments. – Hairy Aug 04 '11 at 06:59
  • @Matt - Can you edit your answer so I can upvote it? – Hairy Aug 04 '11 at 07:05
  • @Hairy, no harm no foul. It's hard to read tone in plain text. ;-) I would change it if I understood what you thought was wrong with it, other than something to do with the cursor. You have "4 - update cursor", but no "0 - create cursor" step. My suggestion was (meant to say) that perhaps moving the position of the cursor from outside to inside the loop, or vice versa, would solve the leak (or maybe not even use a cursor). – matt wilkie Aug 05 '11 at 03:39
  • Step 4 was creating the update cursor; just worded wrong for you I guess. I had to update the fields I created, so I agg, then buffer, then add fields, then create the update cursor to add data to the new fields. There is always only one record created, and the cursor is shut down and deleted once used. No matter what I do, there is a leak. It's acutely frustrating! Thanks for your help. Upvoted. – Hairy Aug 05 '11 at 05:55