Doing a GROUP BY in ArcGIS

As most of you know, I am a big fan of spatial SQL.  It is my go-to tool whenever working with GIS.  But, I have seen too many people using ArcGIS get tripped up with trying to summarize the results of a spatial operation because ArcGIS does not support SQL.  So today, to spare my ArcGIS friends the trouble of writing large “for” loops in Arcpy to populate data that takes hours to run, I want to show you two lines of Arcpy to very quickly replicate the GROUP BY function in SQL:

Using my favorite GIS data set in Tompkins County, NY, assume we have two layers: parcels2007 and watersheds:parwat

 What I want to do is calculate for each watershed, the sum of all the parcel areas.  In SQL, we would create a query statement like:

SELECT Sum(Area(ClipIntersect([Watersheds].id,[Parcels2007].id))) 
       AS watpar, watershed
INTO temptable
FROM [Watersheds],[Parcels2007]
WHERE touches([Watersheds].id,[Parcels2007].id)
GROUP BY watershed

In ArcGIS, you can perform an Intersect_Analysis, and then you can summarize the result of that analysis by taking the SUM of the resulting area (Shape_Area) and essentially group it by the watershed name (watershed):

arcpy.Intersect_analysis(["watersheds", "parcels2007"],"watpar","ALL")

arcpy.Statistics_analysis("watpar","watsum",[["Shape_Area","SUM"]],"watershed")

So  that’s it.  Two lines of Arcpy code.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s