Lets take another break from my textbook, and focus on a classic GIS operation. If you are like me, you learned GIS with ARC/INFO. I started with ARC/INFO 3.2 on a DEC VAX PDP 11. Actually, my first ESRI product was AUTOMAP II on a Burroughs mainframe with the CANDE operating system – we even had to use a SYMAP ruler to plot our points for digitizing.

Anyway, for those of you who used ARC/INFO, you probably found that the NEAR function was one of your best friends. I used it every week to transfer attributes from points to lines, and vice-verse. This is pretty easy to do in SQL, but I will step through it for you to make it more understandable. Lets assume we have two vector drawings: Points and Lines. Obviously, the Points drawing has points in it, and the Lines drawing has lines in it. The SQL to replicate the NEAR function is:

**1. SELECT min(dist), first(lines.id),points.[ID] ****FROM
**

**2. (SELECT lines.id, points.id,distance(points.[Geom (I)],lines.**

**[Geom (I)]**) AS dist**3. FROM points, lines**

**4. ORDER BY dist)**

**5. GROUP BY points.id**

We will actually start our review of the query at lines 2 through 4. This part of the query selects the line IDs and the distance between them – we will call the result of the distance query **dist**. Now, this part of the query selects the distance between every point and every other point, resulting in a table that is *NxN* in size. Line 4 helps us toward our NEAR function result by **ordering** the data by the distance. Therefore, the closest point-to-line pairs are now at the beginning of the results of the query.

Lines 1 and 5 work together as an aggregate query. Here we are returning the **min** distance. Now, what is the min distance? Its the shortest distance between the points and the line – remember that **ORDER BY** we used. Line 5 is really where the magic happens because there are lots of minimum distances in the query result. But, the** GROUP BY** clause aggregates the data by the ID for each point. Therefore, it returns the closest line ID and the distance to each point. The **GROUP BY** statement turns our *NxN* result into an *N* result where *N* represents each of the records in the Points drawing.

Once again, I hope you can see how easy it is to replicate standard GIS functions like NEAR. Sometime in the future, I will show you Intersect, Union, and Identity. But, lets get through a few more examples from my book.

Would one of you be so kind as to ask ESRI to implement spatial SQL into their next release of Arcinfo?

Pingback: k-nearest neighbor with SQL in Radian Studio | gisadvising