10

I loaded the German OSM dataset into the pgRouting DB using osm2po 4.7.7. Everything works fine: I have osm2po set up via its config, and it works like a charm through its Java part.

The *_2po_4pgr table was imported without any problems. Even the *_2po_v table gets imported, though I don't completely understand the purpose of this table.

I executed the pgr_createTopology function, which ran for quite a while (12,000 seconds) while processing all 6 million edges. I thought this would do the trick, but routing is still unbearably slow.

I would like to know if I forgot something. I was thinking of using pgRouting instead of the Java library, but at the moment its performance simply isn't comparable.
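For reference, the topology call I ran looked roughly like this (the table name and tolerance are illustrative; osm2po names tables after the configured prefix):

```sql
-- Table name and tolerance are illustrative; adjust to your osm2po prefix.
SELECT pgr_createTopology('de_2po_4pgr', 0.000001, 'geom_way', 'id');
```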

mgri
  • 16,159
  • 6
  • 47
  • 80
Johnny Cusack
  • 431
  • 4
  • 11
  • 1
    Have you created indexes? Have you tuned the PostGIS memory variables? createTopology is only run once for the whole dataset, so its performance doesn't matter that much. Side note: I had the whole of Finland from the Digiroad dataset (about 2 GB of road network) and it returned results in at most 250 ms, usually 125 ms, without any optimizations. So it should be even better nowadays – simpleuser001 Aug 26 '13 at 13:13
  • There are indexes on the source and target columns, automatically created by the osm2po script generator. Are more needed? I changed the work_mem/maintenance_work_mem variables to a gigabyte value and restarted; still no change. Is there any startup-script template I could use? – Johnny Cusack Aug 26 '13 at 13:43
  • 1
    Hmmm... What does createTopology() do? I mean, osm2po already creates the topology based on OSM node IDs, so there is no need to run something similar again. For pgRouting (shortest_path & shortest_path_astar) you only need the created 4pgr table. That's all. – Carsten Aug 26 '13 at 18:10
  • I now have the Finland dataset, PostGIS 2.0.3, pgRouting 2.0.0-dev, and I have to say this is slow: always over 1 second per result when using pgr_astar(). I'll check if I can get this a little faster – simpleuser001 Aug 27 '13 at 08:20

3 Answers

7

Use this guide to set up indexes for a spatial database. Here is the gist of it:

 1. Create indexes on the ID, source and target columns.
 2. Create an index using GIST on the geom column.
 3. VACUUM.
 4. CLUSTER on the geom column.
 5. ANALYZE.

For my _4pgr and _vertex tables, only the source and target columns had indexes after the import (osm2po-core-5.1.0).
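As a sketch, assuming an osm2po table named hh_2po_4pgr with a geom_way geometry column (adjust to your own prefix and column names), the steps above translate to roughly:

```sql
-- Assumed osm2po table/column names (hh_2po_4pgr, geom_way); adjust to your prefix.
CREATE INDEX ON hh_2po_4pgr (id);
CREATE INDEX ON hh_2po_4pgr (source);
CREATE INDEX ON hh_2po_4pgr (target);
CREATE INDEX hh_2po_4pgr_geom_idx ON hh_2po_4pgr USING GIST (geom_way);
VACUUM hh_2po_4pgr;
CLUSTER hh_2po_4pgr USING hh_2po_4pgr_geom_idx;
ANALYZE hh_2po_4pgr;
```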

kttii
  • 3,630
  • 1
  • 16
  • 36
7

The problem with pgRouting's performance seems to be that the new pgr_astar and pgr_dijkstra use the whole graph (which guarantees a solution if one exists). A simple way to get better performance is to limit the graph to a smaller area. This has its own problems; for example, it may sometimes produce graphs that cannot be solved.

 (SELECT ST_Expand(ST_Extent(geom_way),0.1) as box  FROM hh_2po_4pgr as l1 WHERE l1.source =7 OR l1.target = 12) 

This creates a BBOX over the source and target geometries and expands it by 0.1 degrees; the same subquery is then used to limit the graph size in the pgr_ query.

Dijkstra went from 1.2 s to ~65 ms:

SELECT  seq, id1 AS node, id2 AS edge, g.geom_way as the_geom
    FROM pgr_dijkstra(
            'SELECT id, source, target, cost FROM hh_2po_4pgr as r, 
            (SELECT ST_Expand(ST_Extent(geom_way),0.1) as box  FROM hh_2po_4pgr as l1    WHERE l1.source =7 OR l1.target = 12) as box
            WHERE r.geom_way && box.box',
            7, 12, false, false
    ) as r INNER JOIN hh_2po_4pgr as g ON r.id2 = g.id ;

A* went from 2 s to ~50 ms:

SELECT seq, id1 AS node, id2 AS edge, cost
    FROM pgr_astar(
           'SELECT id, source, target, cost, x1,y1,x2,y2 FROM hh_2po_4pgr as r, 
             (SELECT ST_Expand(ST_Extent(geom_way),0.1) as box  FROM hh_2po_4pgr as l1    WHERE l1.source =7 OR l1.target = 12) as box
            WHERE r.geom_way && box.box',
            7, 12, false, false
    );

osm2po was used to import the data (finland-latest) into a PostGIS table. A GiST index was added to the geom_way column and a full VACUUM ANALYZE was run on the database. Shared memory (shared_buffers) 1 GB, work_mem 512 MB.

simpleuser001
  • 3,864
  • 18
  • 21
  • I had the same idea with the bounding box; it still takes well over 90 seconds, even with the memory variables set etc. – Johnny Cusack Aug 27 '13 at 11:14
  • I have 380k lines; you probably have something like 3M+ lines in your routing table? – simpleuser001 Aug 27 '13 at 11:55
  • I have 7M lines, and it takes around 120 seconds for a simple query with all indices set. If the query is cached it drops from 40 secs to 1.2 secs, but that's still horrible, since after 20 seconds of waiting the whole thing starts again. I give up and will try to accomplish everything with Java only... – Johnny Cusack Aug 27 '13 at 22:22
  • IMHO, the problem is that the pgr_ functions don't limit the graph table, and/or that you can't cache the whole routing table (which is an interesting question). How does the osm2po routing service perform on your data? – simpleuser001 Aug 28 '13 at 08:30
  • 1
    This is one of the main problems: Postgres doesn't cache the whole graph. It works fairly quickly, but I need to join it with other database tables, which in the current (test) situation creates a huge bottleneck at just 5 qps (queries per second) – Johnny Cusack Aug 28 '13 at 10:46
  • 1
    I just loaded a subset of 1M rows onto a ramdisk to compare. pgr_dijkstra takes 3 seconds in a cold run; pgr_astar with the bbox example provided by @simplexio takes around 900 ms for a cold run. So it seems I have to put everything on a ramdisk for proper performance. – Johnny Cusack Aug 28 '13 at 12:24
  • How did you do that? Did you create a RAMDISK and then define a new tablespace? – simpleuser001 Aug 28 '13 at 16:22
  • Yes I did; I posted an answer including a link – Johnny Cusack Aug 29 '13 at 15:36
  • 1
    Great! With @kttii's indexes I'm running fast now! – Magno C May 24 '16 at 14:48
6

I finally came to the conclusion that it's best to put the whole graph (including indices) into a separate tablespace that permanently resides in memory on a ramdisk.

For setting up the ramdisk on Ubuntu 13.04 I used the following instructions, and I must say it's working pretty well (they include instructions for reloading the data into memory after a restart/reboot).

Next week I will get my hands on new SSDs (1 GB/s read) and try to compare the performance.

As far as I can tell, it's the only solution for keeping a graph of 1M+ rows permanently accessible, since there are continuous random reads happening.
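A minimal sketch of the tablespace part, assuming a tmpfs is already mounted at /mnt/pgram and owned by the postgres user (the mount point, tablespace name, and table/index names are all illustrative):

```sql
-- Illustrative only: /mnt/pgram must be an existing tmpfs mount owned by postgres.
CREATE TABLESPACE ramdisk LOCATION '/mnt/pgram';
ALTER TABLE hh_2po_4pgr SET TABLESPACE ramdisk;
-- Move the indexes too, otherwise they stay on disk:
ALTER INDEX hh_2po_4pgr_geom_idx SET TABLESPACE ramdisk;
```

Remember that a tmpfs is volatile: the tablespace contents must be reloaded after every reboot, which is what the linked instructions cover.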

Johnny Cusack
  • 431
  • 4
  • 11