Moran’s I in PostGIS

Just thinking out loud here.  I’ve always been bothered by the complexity of Moran’s I.  Actually, it’s not complex, it’s just math.  And it’s really nothing more than Pearson’s correlation coefficient tricked into a spatial context.  Granted, even calculating Pearson’s by hand is a pain.  But really, it is nothing more than performing a correlation on two arrays.  In this case, the arrays hold the values of adjacent features.  Nowadays, we have great tools for calculating the correlation coefficient, so if you can get two arrays representing the adjacent data, you simply wrap your query in that.  Take a look here as I revisit Figure 15.4 of my textbook An Introduction to Statistical Problem Solving in Geography:

-- Pair each tract with every tract it touches, then correlate the paired pctwhite values
SELECT corr(a.pctwhite, b.pctwhite)
FROM cleveland AS a, cleveland AS b
WHERE st_touches(a.geometry, b.geometry)
AND a."OID" <> b."OID";

We are simply finding those census tracts that are adjacent to one another and obtaining their respective pctwhite values.  That returns two columns, which we pass into the correlation function (corr).
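Written out, the comparison looks roughly like this. It is a sketch that assumes binary contiguity weights (w_ij = 1 when tracts i and j touch, 0 otherwise, which is symmetric because touching is mutual); the symbols n, S_0, d_i, and m are notation introduced here for illustration, not taken from the textbook figure.

```latex
% Moran's I with binary contiguity weights w_{ij}
I = \frac{n}{S_0}\,
    \frac{\sum_i \sum_j w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}
         {\sum_i (x_i - \bar{x})^2},
\qquad S_0 = \sum_i \sum_j w_{ij}

% Pearson's r over the touching pairs, which is what corr() receives
r = \frac{\sum_i \sum_j w_{ij}\,(x_i - m)(x_j - m)}
         {\sum_i d_i\,(x_i - m)^2},
\qquad d_i = \sum_j w_{ij},
\quad m = \frac{1}{S_0} \sum_i d_i\, x_i
```

The two differ only in the mean and the sum of squares. Moran's I uses the plain mean and sum of squares over all n tracts, while r weights each tract by its number of neighbors d_i, because a tract shows up in the paired arrays once per neighbor. If every tract had the same number of neighbors, m would equal the plain mean and the two expressions would coincide exactly.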

The results are nearly identical to ESRI’s Moran’s I index.
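For a side-by-side check inside the database, a direct Moran's I calculation might look like the sketch below. It assumes binary contiguity weights from st_touches and reuses the cleveland table, pctwhite column, and "OID" key from the query above; the CTE names (stats, pairs, num, den) and the output column morans_i are made up for illustration. ESRI's tool may use a different neighborhood definition or weight standardization, so treat this as a rough comparison rather than a replica.

```sql
-- Sketch only: Moran's I with binary (0/1) contiguity weights.
-- Assumes the same cleveland table, pctwhite column, and "OID" key as above.
WITH stats AS (
    -- number of tracts and the plain (unweighted) mean
    SELECT count(*)::float8      AS n,
           avg(pctwhite::float8) AS xbar
    FROM cleveland
),
pairs AS (
    -- one row per ordered pair of touching tracts (w_ij = 1)
    SELECT a.pctwhite::float8 AS xi,
           b.pctwhite::float8 AS xj
    FROM cleveland AS a, cleveland AS b
    WHERE st_touches(a.geometry, b.geometry)
    AND a."OID" <> b."OID"
),
num AS (
    -- S_0 (sum of weights) and the cross-product sum over neighbor pairs
    SELECT count(*)::float8                       AS s0,
           sum((p.xi - s.xbar) * (p.xj - s.xbar)) AS cross_sum
    FROM pairs AS p CROSS JOIN stats AS s
),
den AS (
    -- sum of squared deviations over all tracts
    SELECT sum((c.pctwhite::float8 - s.xbar) ^ 2) AS ss
    FROM cleveland AS c CROSS JOIN stats AS s
)
SELECT (s.n / num.s0) * (num.cross_sum / den.ss) AS morans_i
FROM stats AS s, num, den;
```

Running this next to the corr() query gives both numbers from the same adjacency definition, so any remaining gap comes from the statistic rather than from how the neighbors were picked.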

What do our statistician friends think?

If you want to learn how to write spatial SQL code, work with Postgres, or understand statistics and geography, check out my courses here
