Trade Area Analysis in Postgres / PostGIS

How I used SQL to generate an accumulated sum of spatial data

My friend Tomas does work with business analytics, and wanted to find a way to perform trade area analysis.  He had a bunch of stores, and a map of census areas with populations.  What he wanted to figure out was:

how far do I have to radiate out from each store before I hit 5,000 customers

So for each store, he wanted to basically generate concentric buffers and sum up the population of the census areas before he hit 5,000.  Once he found the census areas, he also wanted to generate a convex hull to show the area.  ESRI has a really nice tool that performs this as part of their business analyst package, called threshold trade areas.

Check it out, it seems pretty cool.

Well, to help my friend I was thinking that I would determine the distance from every store to every census area, and then write a script with a giant for loop to iterate through each record, accumulating the results (of course, I would have to integrate some kind of do..while loop and an if…then to make sure I accumulated different counts for each store until I hit the threshold I wanted.  At that point I began asking myself so, how good of friend is Tomas?

What I did instead was write an SQL script to do it. I’ve color coded it below to explain what I was doing….

SELECT ST_ConvexHull(ST_Collect(g)) as geometry, max(sumpop) as sumpop, name
INTO tradearea
FROM
 (
  SELECT a.name, SUM(a.totpop) AS sumpop, ST_Collect(a.geometry) as g
  FROM 
     (SELECT stores.name, censusunit.totpop, censusunit.geometry,
      ST_Distance(censusunit.geometry,stores.geometry) as dist
      FROM stores, censusunit
      ) AS a,
     (SELECT stores.name, censusunit.totpop, censusunit.geometry,
      ST_Distance(censusunit.geometry,stores.geometry) as dist
      FROM stores, censusunit
      ) AS b
  WHERE a.name = b.name
  AND a.dist <= b.dist
  GROUP BY a.name, b.dist
  ) AS T1
WHERE sumpop < 5000
GROUP BY name

The middle portion in orange collects the names of the stores, the population of the census areas, the distance between each store and each census area, and the geometry of the census areas for each combination of stores and census areas.  So, if you have 5 stores and 1,000 census areas, you would have a table with 5,000 rows: Continue reading

SQL and ogr to create 4D lines

A friend recently asked me to help him generate polyline shapefiles with Z and M values that he could deliver to a customer.  The problem was, the software he was using supported the import of Z and M values, but did not support the export of those files.  The other problem was, he has zillions of data tables that he needed to export!

Fortunately, PostGIS allows us to create Polyline ZM data rather easily.  The next part was to figure out how to get it exported to a shapefile.  So, here goes:

Assume you have a table with 4 columns: lineID, Z, M, and geometry.  The geometry field is a point feature that represent vertices on a line.  The lineID separates all those points by which line they are part of, and the Z and M values are, well, Z and M.

Make the Polyline ZM

The SQL command to create a 4D line is:

SELECT  lineid, 
ST_MakeLine(ST_MakePoint(ST_X(geometry), 
                         ST_Y(geometry), z, m)
            ) AS g
FROM zpts
GROUP BY lineid;

In this case, I am making a line (ST_MakeLine) as a series of X (ST_X), Y (ST_Y), Z, and M values.  Grouping the results of the query by the lineid allows me to create a line for each set of points. Continue reading