16

I have a table with around 2 million records in it. I create a spatial index, using the defaults other than the bounding box. I've been noticing that some queries are extremely fast, and some are extremely slow. The determining factor appears to the size of the polygon used in the query.

On larger search areas, using WITH(INDEX(SIX_FT5)) slows down the query considerably (from 0 seconds, to 15+ seconds). On smaller search areas, the exact opposite it true.

Here's some of the queries I'm testing with:

Fast:

SELECT TOP(1000) * FROM [FT5] WHERE (shape.STIntersects(geometry::STGeomFromText('POLYGON ((-133462.805381701 -668610.241000959, 2934415.68824241 -668610.241000959, 2934415.68824241 2200521.65831815, -133462.805381701 2200521.65831815, -133462.805381701 -668610.241000959))', 2264)) = 1) 

Slow:

SELECT TOP(1000) * FROM [FT5] WITH(INDEX(SIX_FT5)) WHERE (shape.STIntersects(geometry::STGeomFromText('POLYGON ((-133462.805381701 -668610.241000959, 2934415.68824241 -668610.241000959, 2934415.68824241 2200521.65831815, -133462.805381701 2200521.65831815, -133462.805381701 -668610.241000959))', 2264)) = 1) 

Anyone know what's going on here?

PolyGeo
  • 65,136
  • 29
  • 109
  • 338
mdm20
  • 261
  • 1
  • 4
  • I was just going through something similar http://dba.stackexchange.com/questions/61289/how-to-take-advantage-of-multiple-spatial-indices the other day... I wasn't generating a polygon from text, but was intersecting points and polygons... I specified to use the spatial index on the point, which had great speed results. I then tried using the spatial index on the polygon, and had very poor performance...which seems like the exact opposite of your problem! – DPSSpatial_BoycottingGISSE Mar 21 '14 at 22:10
  • 4
    If you think about it, changing the size of the search envelope should have a significant impact on the query -- the more rows that are returned through an index, the slower the response. At some point, it becomes faster to full table scan and throw away rows based on envelope. I would suggest that you spend more time with the spatial index options, since you probably have room for optimization of the index. – Vince Mar 21 '14 at 22:18
  • Are your records representing points? That was not stated. Also, can you publish the create index syntax that you used? Was it AutoGrid? – gischimp Oct 20 '15 at 18:19
  • I have used 'Geography Auto Gird' and 'Cells per Object' = 4000. Intersected 110+ million points with ~45K polygons. – Michael Mar 05 '17 at 11:16
  • It's been a while since I did this, but I tried every possible index creation syntax possible and never got good results. I was querying to find polygons, not points. In the end, I ended up using geometry stored using 4326 projection and had better results. I stored the original geometry also and would return that to the user. – mdm20 Mar 07 '17 at 17:47
  • 1
    Another thing you have to remember is that an intersect is a complex operation, first it must look if the elements bound intersect, relatively fast operation through indexes, but then for each item that matches, it must calculate if every single item actually intersects, which is yet another, more costly operation, which becomes even more costly as the polygons are more complex and/or more numerous. – AKK2 Apr 25 '17 at 12:24
  • Without an index, SQL can simply use the bounding box as a first pass, which is equally costly regardless of query polygon size (two sorts). With an index, a large polygon will cover more branches of the index so require separate sort operations for each. The actual intersect (as per AKK2's point) would not be affected by the index I think but of course a larger bounding box means more points to check. – AnserGIS May 19 '17 at 12:18
  • Depending on your setup you may be better off creating a WKB Geometry to use in your query instead of the WKT, check the output of explain to see if MSSQL is optimizing that much anyway. Also you should make the index with the bounding box and run a test to determine if it is required. Also Explain may indicate what index if any will help – flurbius Dec 27 '17 at 21:04

1 Answers1

1

As commented by @Vince:

If you think about it, changing the size of the search envelope should have a significant impact on the query -- the more rows that are returned through an index, the slower the response. At some point, it becomes faster to full table scan and throw away rows based on envelope. I would suggest that you spend more time with the spatial index options, since you probably have room for optimization of the index.

PolyGeo
  • 65,136
  • 29
  • 109
  • 338