Finding the farthest distance inside a polygon

On the Manifold GIS discussion board, someone asked about updating a polygon layer with the distance of the farthest pair of vertices for each polygon.  I thought that was a fun little conundrum to try out.  You can see my Manifold SQL answer here.

That made me think about how to write the same query in PostGIS.  It’s actually fairly easy, as the code below shows:


SELECT max(ST_Distance(A.g,B.g)) , A.gid
FROM
(SELECT gid, (ST_Dump(geom)).geom as g
FROM floodzone) AS A,
(SELECT gid, (ST_Dump(geom)).geom as g
FROM floodzone) AS B
WHERE A.gid = B.gid
GROUP BY A.gid

Now, I am doing it for a layer called floodzone.  The first thing you have do, as illustrated in lines 3-6 is get the gid and geometry field from the floodzone.  We do this twice, because we have to fake Postgres into thinking their are two tables we are selecting from (A and B).  The ST_Dump function dumps out all the coordinates for each polygon as a geometry dump.  So, the .geom reference turns the geometry dump into actual geometries. So, at the end of lines 3 and 4, you have a table with the gid of the polygon and all the individual point geometries.

Since we have the vertices busted out, we are using line 1 to obtain the maximum distance (ST_Distance) between every vertex.  Then, when we issue line 8, we are grouping that maximum distance by the gid.  The final result is table that tells us the maximum distance between vertices for each polygon.

Let’s speed things up

On this file, it took 15 seconds.  So I got to thinking: how can we make it faster. Well, in basic big-O notation, the algorithmic complexity of what I’m doing above is O(n^2).  One way to improve  things (especially when you have polygons with massive numbers of vertices), is to reduce the polygon vertices to a convex hull.  A convex hull runs in O(nlogn), and when you are done you have much fewer vertices.  Then, when you run the distance-to-distance calculation, you are running the O(n^2) function on much fewer points.

The updated query is simply returns the ST_ConvexHull of the polygons:


SELECT max(ST_Distance(A.g,B.g)) , A.gid
FROM
(SELECT gid, (ST_Dump((geom))).geom as g
FROM floodzone) AS A,
(SELECT gid, (ST_Dump((geom))).geom as g
FROM floodzone) AS B
WHERE A.gid = B.gid
GROUP BY A.gid

Running the query this way, the result was achieved in 2 seconds!!

This is why I teach a few lectures on computational geometry in my undergraduate GIS Programming class.

If you want to learn more tricks for using SQL like this, take a look at my Spatial SQL class – hey, for that matter, take a look at all my classes here.

 

 

3 thoughts on “Finding the farthest distance inside a polygon

  1. Just a quick update, Phil. I ran this query with a really big dataset, and my original query took 46 seconds. I then modified what you mentioned:

    SELECT A.id2, ST_Length(ST_LongestLine(A.geometry,B.geometry))
    FROM soils as A, soils as B
    WHERE A.id2 = B.id2

    and it ran in 11 seconds!

    Finally, I tried ST_MaxDistance:

    SELECT A.id2, ST_MaxDistance(A.geometry,B.geometry)
    FROM soils as A, soils as B
    WHERE A.id2 = B.id2

    and it too ran in 11 seconds.

    Thanks to you, I think I’ll be using this one as an example in class this semester :-)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s