This post is part of a series entitled Spatial is not Special, where I will illustrate how spatial constructs in SQL provide us with a rich toolset for geographers doing spatial analysis. To illustrate these concepts, I will be showing examples from my bookStatistical Problem Solving in Geography. Even though PostGRES, SQLServer, MySQL, spatialite, and Oracle all support spatial SQL, these examples will be presented using Manifold GIS. The example dataset is a Manifold 8.0 .map file and can be found here.
In our previous post we saw how easy it was to compute the mean center of a geographic dataset with SQL. Today’s post examines the SQL code necessary to generate the weighted mean center for a geographic data set. Recall from Statistical Problem Solving in Geography (third edition), the formula and computation of weighted mean center and the 7 point data set used.
In this case, the attribute field we are interested in is a column named f. We used f to represent the term frequency (it’s what statisticians do). Calculating the weighted mean center is slightly more complicated than yesterday’s example of the mean center, but is still rather easy to accomplish with SQL.
SELECT sum(x*f)/(SELECT sum(f) FROM Points), sum(y*f)/(SELECT sum(f) FROM Points) FROM Points
Notice that the use of the sum function will sum the product of the coordinate value and the frequency field. This must be divided by the sum of all the frequencies, therefore we are required to issue a second SQL statement to obtain the sum of the frequencies (please note, some SQL engines like that in SQLServer or Oracle allow the user to store variables, so you could obtain the sum of the frequencies and then use that later in the SQL statement – but this is fairly straightforward so we did it all in a single SQL statement).
And once again, we can convert the x,y coordinates into a geometry by using the NewPoint function in our SQL engine:
SELECT NewPoint(num(x*f)/(SELECT sum(f) FROM Points), sum(y*f)/(SELECT sum(f)) FROM Points) FROM Points
Our next post will examine how to calculate the central feature in a dataset using SQL.
Comments