New Books – How do I do that in PostGIS, How do I do that in Manifold SQL

I have two new books out: How do I do that in PostGIS and How do I do that in Manifold SQL.

From the back cover of How do I do that in PostGIS:

For those who are unsure if SQL is a sufficient language for performing GIS tasks, this book is for you. This guide follows the topic headings from the book How do I do that in ArcGIS/Manifold, as a way to illustrate the capabilities of the PostGIS SQL engine for accomplishing classic GIS tasks. With this book as a resource, users will be able to perform many classic GIS functions using nothing but SQL.


A Poor Man’s Parallel Processor for GIS

In addition to SQL, I am also interested in processing large volumes of spatial data.  One of the latest crazes in “big data” is Hadoop.  According to Wikipedia:

Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

One way this is implemented is through a programming model called MapReduce.  Don’t get too excited: it has nothing to do with maps or GIS, but it is very clever and powerful for certain types of problems.  The idea is that if you have a really large dataset, you divide and conquer it in a number of steps.  For example, say we wanted to count all the people named “John” in the phonebook, and say we had 26 computers in a cluster.  We might solve this by:

1.  Assign each computer (1-26) one letter (A-Z), and have it find all the “Johns” whose last name starts with that letter.  That way, you have effectively broken the problem into 26 smaller units.

2.  Once each computer has counted up the number of Johns, you have reduced the dataset (hence, MapReduce) to 26 variables.

3.  Now, count up the total of the 26 variables.
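
The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Hadoop itself works: the phonebook entries and the function names (partition_by_last_initial, count_johns, total_johns) are made up for the example, and the chunks are processed one after another on a single machine rather than on 26 separate computers.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical phonebook entries as (first_name, last_name) pairs.
PHONEBOOK = [
    ("John", "Adams"),
    ("Mary", "Adams"),
    ("John", "Baker"),
    ("John", "Brown"),
    ("Alice", "Zimmer"),
    ("John", "Zimmer"),
]


def partition_by_last_initial(entries):
    """Split the phonebook into one chunk per last-name initial."""
    chunks = defaultdict(list)
    for first, last in entries:
        chunks[last[0].upper()].append((first, last))
    return chunks


def count_johns(chunk):
    """The work a single computer would do on its own chunk (step 1)."""
    return sum(1 for first, _ in chunk if first == "John")


def total_johns(entries):
    chunks = partition_by_last_initial(entries)
    # Steps 1 and 2: one partial count per letter, at most 26 values.
    partial_counts = {
        letter: count_johns(chunk) for letter, chunk in chunks.items()
    }
    # Step 3: add the partial counts up into a single total.
    total = reduce(lambda a, b: a + b, partial_counts.values(), 0)
    return partial_counts, total


if __name__ == "__main__":
    partial_counts, total = total_johns(PHONEBOOK)
    print(partial_counts)
    print(total)
```

In a real cluster, each call to count_johns would run on a different machine at the same time, which is where the speedup comes from. Only the small partial counts need to travel back for the final sum.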

That is an oversimplified version of course, but it helps to illustrate what we want to do.  I understand that the University of Minnesota has created a set of functions called SpatialHadoop.  I want to test this over the summer, but for now I decided to create my own poor man’s version using Postgres.