On the **Manifold GIS discussion board**, someone asked about updating a polygon layer with the distance of the farthest pair of vertices for each polygon. I thought that was a fun little conundrum to try out. You can see my Manifold SQL answer **here**.

That made me think about how to write the same query in PostGIS. It’s actually fairly easy, as the code below shows:

SELECT max(ST_Distance(A.g,B.g)) , A.gid FROM (SELECT gid, (ST_Dump(geom)).geom as g FROM floodzone) AS A, (SELECT gid, (ST_Dump(geom)).geom as g FROM floodzone) AS B WHERE A.gid = B.gid GROUP BY A.gid

Now, I am doing it for a layer called **floodzone**. The first thing you have do, as illustrated in lines 3-6 is get the gid and geometry field from the floodzone. We do this twice, because we have to fake Postgres into thinking their are two tables we are selecting from (A and B). The **ST_Dump** function dumps out all the coordinates for each polygon as a *geometry dump*. So, the .geom reference turns the *geometry dump* into actual geometries. So, at the end of lines 3 and 4, you have a table with the gid of the polygon and all the individual point geometries.

Since we have the vertices busted out, we are using line 1 to obtain the maximum distance (ST_Distance) between every vertex. Then, when we issue line 8, we are grouping that maximum distance by the gid. The final result is table that tells us the maximum distance between vertices for each polygon.

**Let’s speed things up**

On this file, it took 15 seconds. So I got to thinking: how can we make it faster. Well, in basic big-*O* notation, the algorithmic complexity of what I’m doing above is *O*(n^2). One way to improve things (especially when you have polygons with massive numbers of vertices), is to reduce the polygon vertices to a convex hull. A convex hull runs in *O*(nlogn), and when you are done you have much fewer vertices. Then, when you run the distance-to-distance calculation, you are running the *O*(n^2) function on much fewer points.

The updated query is simply returns the ST_ConvexHull of the polygons:

SELECT max(ST_Distance(A.g,B.g)) , A.gid FROM (SELECT gid, (ST_Dump((geom))).geom as g FROM floodzone) AS A, (SELECT gid, (ST_Dump((geom))).geom as g FROM floodzone) AS B WHERE A.gid = B.gid GROUP BY A.gid

Running the query this way, the result was achieved in 2 seconds!!

This is why I teach a few lectures on computational geometry in my undergraduate GIS Programming class.

If you want to learn more tricks for using SQL like this, take a look at my Spatial SQL class – hey, for that matter, take a look at all my classes **here**.

Hi Art,

You may also be interested in the ST_LongestLine function

eg

SELECT ST_Length(ST_LongestLine(geom,geom) FROM mypolygontable;

http://postgis.net/docs/ST_LongestLine.html

Thanks! Never ceases to amaze me. Going to try ST_MaxDistance too. Going to see which is faster.

Just a quick update, Phil. I ran this query with a really big dataset, and my original query took 46 seconds. I then modified what you mentioned:

SELECT A.id2, ST_Length(ST_LongestLine(A.geometry,B.geometry))

FROM soils as A, soils as B

WHERE A.id2 = B.id2

and it ran in 11 seconds!

Finally, I tried ST_MaxDistance:

SELECT A.id2, ST_MaxDistance(A.geometry,B.geometry)

FROM soils as A, soils as B

WHERE A.id2 = B.id2

and it too ran in 11 seconds.

Thanks to you, I think I’ll be using this one as an example in class this semester :-)