Virtual layers with QGIS – wait a second, this is starting to look like Manifold…

If you are like me, you love writing spatial SQL. But what can you do if you have a geodatabase, some shapefiles, or any other data format? Well, you can always take those data formats and import them into Postgres or SQLite, and then issue the SQL for the results. Sure, it works, but it’s an extra step, and you no longer have access to the desktop GIS tools.

Well, with QGIS, you can drag and drop any data format into your project. And, with the DB Manager, you can refer to those layers as Virtual Layers, allowing you to issue SQL on them. Take a look at this video to see how issuing SQL on disparate data sources is a breeze.

I still think Manifold has the best integrated SQL engine tied to their GIS, but virtual layers in QGIS is seriously compelling, especially with the drag-and-drop capabilities of bringing in the different data sets.

if you want to learn more about spatial SQL or QGIS, check out my courses on Learning the FOSS4g Stack: QGIS Desktop, and Learning the FOSS4g Stack: Spatial SQL with Postgres and PostGIS.

Easy bivariate map with Postgres

A friend recently asked me about the cool looking bivariate maps produced with ArcOnline, lamenting that the capability seemed lacking in ArcGIS. Well, it turns out that ESRI has a .dll you can use, and there is a good article here. So, if you want to create these great looking maps in ArcGIS, it shouldn’t be a problem. The website will allow you to download the .dll, and you can also watch the video on how to use it. Well worth the time.

So of course that got me thinking: could we make the same map with spatial SQL. Well, sure, and it is super easy. If you want to get spun up on what these 9 color bivariate choropleth maps are, and the theory behind it, have a look at this great site. Josh does a great job explaining how this works, but it is a little cumbersome if you want to pump out map after map. But, with a very little bit of SQL, you can easily pull it off.

Let’s start with our data: I have a Postgres table of United States County boundaries, with the attributes percentobese and percentdiabetes, along with a FIPS code and a geometry column.

To prepare the data for the map, we simply issue this SQL query:

SELECT 
fips,geom,
ntile(3) over (order by percentobese) || '.' ||
ntile(3) over (order by percentdiabetes) AS bimode
INTO qlayer
FROM ushealthrisk

yes, that’s it. Really.

I’m not kidding. We’re done, folks. Go home, nothing left to see.

Well, if you insist on reading, I’ll tell you what the SQL does, and how to actually visualize the data.

Continue reading

Engaging students with multiuser editing, and FOSS4g

A friend is doing some research on estrogen and estrodial in water samples in Delaware. We have about 50 sub-watersheds that show the upstream contributing area for each water sample. We’ve done some regression analysis to see if we can find any correlation of estrogen with landscape parameters, like land use.


Geography students collaborating on data collection for a research project. Notice there are no windows in this lab – this is the best way to get students to want to go to graduate school so they don’t have to work like this again!


So far, we haven’t been impressed with the results. Nonetheless, I’m not ready to give up, so I thought:

what if there is a relationship with the number of poultry houses upstream of each sample.

The bad news: we don’t have a layer of poultry houses in Delaware.
The good news: I have a bunch of eager students who want to learn GIS.
The solution: Let’s use a lecture to teach students about multi-user, simultaneous editing with Postgres and PostGIS, and have them do it.

So that’s what we did. On Wednesday, I took 18 of my Advanced GIS students, and introduced them to an 8 minute video that would step them through digitizing all the poultry houses in the watershed. We had Postgres running on the teaching computer, and QGIS on 18 workstations in the classrooms. Our basic setup looked like this (with 18 QGIS workstations, of course):

I then had to instruct my students how to do the digitizing. Since people catch on at different paces, the best thing was to just use this video:

After 1h 40m, the students had around 1,000 poultry houses digitized! And, as a learning experience, they had the opportunity to see how an enterprise class database could be stood up in a matter of minutes to facilitate mult-user digitizing. It blew their mind to realize that after 2 hours, we had actually put in 36 man hours of digitizing. They had never used QGIS before, and it only took an 8 minute video to spin them up on the project!

This was easy to do, excited the students to be exposed to multi-user editing and open source GIS, and accomplished an important task for a research project that an individual student is working on. Everyone wins!

I hope this inspires you to come up with some creative ways to introduce GIS concepts to your class.

If you want to learn more about how to build an enterprise GIS with open source tools, check out my courses on www.gisadvisor.com

Big Data Results

I wanted to revisit the taxi data example that I previously blogged about.  I had a 6GB file of 16 million taxi pickup locations and 260 taxi zones.  I wanted to determine the number of pickups in each zone, along with the sum of all the fares.  Below is a more in-depth review of what was done, but for those of you not wanting to read ahead, here are the result highlights:

Platform Command Time
ArcGIS 10.4 AddJoinManagement Out of memory
ArcGIS Pro Summarize Within 1h 27m*
ArcGIS Server Big Data GeoAnalytics with Big Data File Share Summarize Within

Aggregate Points

~2m
Manifold 9 GeomOverlayContained 3m 27s
Postgres/PostGIS ST_Contains 10m 30s
Postgres/PostGIS (optimized) ST_Contains 1m 40s
*I’m happy ArcGIS Pro ran at this speed, but I think it can do better.  This is a geodatabase straight out of the box. I think we can fiddle with indexes and even structuring the data to get things to run faster.  That is something I’ll work on next week.

Continue reading

Finding “Dangles” with PostGIS

Do you have a set of lines that you need to determine if there are any “dangle” nodes?  A dangle is a line segment that overhangs another line segment.  Now, some dangles are valid, like a pipe that terminates in a cul-de-sac.

A few people have posted about this already, but I figured I would give it a shot as well, as I think my SQL is a little more terse.  Anyway, here is the query, and we’ll talk about it line by line:

SELECT DISTINCT g1 ASINTO dangles
FROM plines, 
    (SELECT g AS g1 FROM  
         (SELECT g, count(*) AS cnt  
          FROM  
              (SELECT  ST_StartPoint(g) AS g FROM plines
               UNION ALL
               SELECT  ST_EndPoint(g) AS g FROM plines ) AS T1 
         GROUP BY g) AS T2
     WHERE cnt = 1) AS T3
WHERE ST_Distance(g1, g) BETWEEN 0.01 AND 2;

Continue reading

Multi-Ring (non-overlapping) Buffers with PostGIS

I was interested in creating mult-ring buffers but with a twist: I didn’t want the buffers to overlap one another.  In other words, if I had concentric buffers with distances of 100, 200, and 300 around a point, I want those buffers to reflect distances of 0-100, 100-200, and 200-300.  I don’t want them overlapping one another.  You can actually do that with the PostGIS function ST_SymDifference, but there are a few nuances that you have to be aware of.

Unlike some of my longer videos, this one will start out with the answer, and then we’ll walk through all the SQL.  You’ll see it isn’t so bad.  And, you continue to see that spatial is not special!.  It’s only 20 minutes long, but the answer is shown in the first minute.

In the video I’ll slowly walk you through all the spatial SQL to create buffers for the points and trim all the overlaps so that there are no overlapping buffers.  You’ll learn some really cool Postgres commands  including:

 ST_BufferST_DifferenceSymDISTINCT ON, and SET WITH OIDS.

I found myself amazed that with a few SQL tweaks, we were able to turn ordinary buffers to more useful non-overlapping buffers.  I hope you enjoy the video.

I’d like to create more videos like that – please leave so comments below so that I know others want me to continue these kinds of tutorials.

 If you want to learn more about SQL, programming, open source GIS, or Manifold GIS, check out courses at www.gisadvisor.com.  

Cartography in ArcGIS and QGIS

meghanToday I want to introduce you to another one of my students, Meghan Murphy.  Meghan is an outstanding student, and one of the top undergraduates I have ever worked with (I know, I say that a lot, but they just keep getting better and better).  Even as a Sophomore, Meghan was always helping other students out, even the Seniors – students would seem to wait for Meghan to organize everyone together to study for upcoming exams.

She also has an innate ability to work with GIS, and pick up new things: one day she has never programmed in Python, and the next day, she has a couple of hundred line Python script created and running in ArcGIS!  So, I was so happy when Meghan said she wanted to take a special course in Open Source GIS that I was offering this semester.  We covered QGIS, Postgres/PostGIS, GDAL, and Geoserver.  For her final project, Meghan decided she wanted to compare the cartography capabilities of ArcGIS and QGIS, and make a video about it (maybe she was inspired by my videos, or maybe she just figured after watching Lembo’s videos, how could I do worse!).

Whatever her reason, like everything else she does, this turned out great, especially since she had never done a live tutorial like this.  So, I encourage you to watch the side-by-side comparisons for creating a basic cartographic product in both ArcGIS and QGIS.  It’s about 40 minutes long, but worth every minute: I found that I learned some things I hadn’t known regarding some cartographic tools.  And, on that note, I’ll have more videos from my undergraduates shortly (some built web maps, others built an enterprise GIS with Postgres.

If you want to learn more about open source GIS, Python programming, Spatial SQL, or Spatial Statistics, check out my online courses at www.gisadvisor.com.  

The Web Duel – Last Thoughts

The Web GIS Duel: Final Thoughts

This is a continuation of Mark Balwanz’s blog posts on his creation of web mapping sites using both ESRI and Geoserver.  Today he will talk about his experience creating the site using open source technology.

Over the first three parts of this blog series (here, here, and here), I have laid out my project plan, walked you through my ESRI implementation, and my Open Source implementation. During this fourth and final blog I plan on sharing with you my overall thoughts on both implementations in regards to what I liked and disliked. As I mentioned in Part 1 of this series, everything I share here is just my opinion and is based on this one project. I also want to point out that most of my previous experience, both academic and professional, is based in ESRI and is probably shaping some of my opinions.

I will start by sharing my opinions about my experience building the ESRI version. I found the ESRI build to be quite simple thanks to the enormous amount of information that can be found online. Since the entire stack is from the same company the integration between ArcGIS Desktop, ArcGIS Server, and the API was seamless and very easy to navigate. Publishing data to ArcGIS Server was a breeze thanks to the publishing tools that they have included in the desktop software. I also found it extremely easy to pull ArcGIS Server map services into the application that was built with the API. None of this should come as a surprise since these things were built by the same company and are meant to work together. Another positive I found with ESRI is their documentation. The ArcGIS for Developers website and specifically the JavaScript samples and API reference page contained everything I needed to build my application. The online community of users is quite large as well which provides a tremendous wealth of information within forum posts. I think one of the huge benefits that the ESRI implementation provided was the ArcGIS Server capabilities; the identify task, geometry service (although I did not use it), and many others. I found using ArcGIS for Server to be much easier than GeoServer when it came to writing JavaScript code to query the services. I really did not run into anything with the ESRI implementation that made frustrated me. Of course, if I was building something like this for a company I would have to factor in the cost of the systems as well. The ESRI stack has the advantage of being built by a single company with a single vision of how everything should fit together, and that is something that by its nature the Open Source stack will never have.

So as you can tell I enjoyed my time building the ESRI version, but was really excited to see how the Open Source one would compare. I think I mentioned this in Part 3 of this series, but I was once again very impressed with QGIS. I found it very intuitive and the amount of plugins that you can add is very impressive. The GeoServer Manager plugin made the publishing of WMS to GeoServer a trivial task. As mentioned above, I do prefer ArcGIS Server over GeoServer, but that being said the GeoServer manager page is very easy to use and seems to be well thought out. The open source JavaScript libraries were also quite impressive. Although I ended up switching from Leaflet to OpenLayers 3, I still came away impressed by how easy it was to use Leaflet was and will be looking to use it in the future. The one problem I had with OpenLayers was how hard it was to find accurate information about OpenLayers 3, online. Most everything I found was for OpenLayers 2 and that does not migrate very well to OpenLayers 3. Even the book I bought for OpenLayers 3 had some code that is not included in the library anymore and therefore did not work. The samples page on the OpenLayers 3 website was helpful, but overall I was not impressed by their documentation. Another area where I felt was lacking was the integration of GeoServer and OpenLayers. I found it complicated to perform relatively simple spatial queries against a GeoServer WMS from within OpenLayers. What made this more difficult is that I had to search through two sets of documentation to solve my problem (GeoServer and OpenLayers 3) rather than just one. I do think that some of these problems would probably decrease the more I used GeoServer and OpenLayers, but better documentation would make it easier for people to jump into the Open Source world. After completing this project though, I am pretty confident that the Open Source world has the capabilities to match ESRI and the fact that the software is free is a huge benefit. Additional work to more seamlessly integrate these Open Source projects would go a long in making them more user friendly.

As someone who has spent most of his GIS life working with ESRI, I have to admit it was a little uncomfortable to move into the Open Source world. However, being uncomfortable was a good thing as it pushed me to learn a lot of new technologies. I think Open Source can be a great way for organizations to start using GIS as there is a lot less upfront cost and with a little research you can find the Open Source project that best fits your need.

I hope you have enjoyed this blog series and have maybe learned something that you can use. Please feel free to leave comments if you have any questions and thanks for reading!

Workshops at the Maryland Geospatial Conference

tugisThe Maryland’s Geospatial Conference  () is on March 20/21, 2017.  I first attended TUgis in 1990, and it is always a great conference.  It is not too large, so it is  great way to have extended time with people.  So, if you had a technical question for someone from say ESRI, you could simply stop by their booth and have a chat.

This year I was asked to support the pre-conference workshops.  I will be presenting two workshops with the help of my students.  If you recall, my students are quite good at instructing others about GIS technology.  I’m really looking forward to the conference and interacting with people during the workshop.  Keep in mind, this is not something we are just throwing together – we’ve been spending a lot of time thinking about how to effectively move people through the material so that beginners do not get lost, and more technically savvy people are sufficiently challenged.  We are fanatical about making sure people’s learning experience is excellent.

A description of the courses are found here:

Spatial SQL: A Language for Geographers:  Are you stuck in a rut of only knowing how to use a GIS GUI? Do you want to learn how to automate tasks, but are afraid of computer programming. If so, SQL is the most powerful tool you can learn to help you perform complex GIS tasks. This hands-on course is designed to teach you how SQL can replicate many database and GIS tasks. We will start at a very basic overview and then proceed to more advanced topics related to GIS.

Topics to include:

  • Spatial is NOT Special
  • SQL Data Types
  • Traditional SQL
  • Spatial SQL for Vector and Raster Analysis
  • Spatial SQL for Classic Geographic Analysis

For this class, we’ll be using spatiaLite which is the spatial extension used with SQLite.  This is a great way to get started, as it is very similar to the functionality of Postgres/PostGIS.  If you want to move to enterprise GIS with Postgres or even Oracle or SQLServer, you’ll be in really good shape.

Python for Geospatial: If you are in the field of GIS, you’ve probably heard everyone talking about Python, whether it’s Arcpy in ArcGIS or special Python packages for doing things in open source.  In this hands-on workshop you will learn how Python is used to perform GIS analysis. The workshop will be an introduction to Python, with emphasis on integrating multiple Python plug-ins with ArcGIS and open source GIS.

Topics to include:

  • An overview of Python (variables, statements, I/O, writing code)
  • Python plug-ins for Geospatial (numpy, geocoder, pygal, Postgres)
  • A Taste of Arcpy
  • A Data Analytics Project with Python (for this, we will geocode addresses using Python, perform analysis with open source GIS, take the results into Arcpy to do more GIS analysis, compute statistical results with Python calling Excel, and then create charts and graphs of the results for use on the Internet—without ever opening up a single GIS product.)

If you want to learn more about how to use GIS technology, check out the 9 courses at gisadvisor.com.  

A Typical Class Project at Salisbury University: Evaluating Geocoding Accuracies

I’ve always been proud of our Salisbury University GIS students, and love to push them as far as their little minds can handle it.  You may recall that last Spring I had my Advanced GIS students perform independent GIS projects and present those projects as posters at an Undergraduate Research Conference.  Well, this Fall I am teaching GIS Programming, and have 7 awesome students (pictures and bios to follow).  We started the year off learning spatial SQL with Postgres and PostGIS.  We have now moved into Python, which includes Arcpy as well as other Python packages.

The semester was going so well, and the students were so responsive to anything I asked, I said what the heck, let’s try something crazy.  So, I showed the students how to use two Python geocoding packages (geocoder and censusgeocode) and then said:

why don’t we conduct a research project over the next week to test the match rates and positional accuracies of the Google API and the United Census Bureau API.  

So yeah, I gave them a week to put this together: design, code, analyze, and write.  And, like most of my students at this level, they didn’t disappoint me.  This meant they had to integrate a lot of what they have learned over the years (programming, GIS, statistics, etc.).

I just uploaded their work onto researchgate:

 Click for ResearchGate Article  

I was surprised by how little there is out there in terms of quantitative assessment of geocoding accuracies.  I hope you have a chance to click on the link and check out the working paper (we will submit it to a journal sometime soon).  Also, I included a short abstract below so that you can see the results of our work (note: our paper includes the original data and the source code for performing the geocoding):

Undergraduate Research in Action: Evaluating the positional differences between the Google Maps and the United States Census Bureau geocoding APIs

Abstract:  As part of a class assignment in GIS Programming at Salisbury University, students evaluated 106 geographically known addresses to determine the match rate and positional accuracy obtained using the Google and the United States Census Bureau geocoding application programming interface (API)s. The results showed that 96.2% of the addresses supplied by the Google API were successfully geocoded, while 84% of the addresses supplied by the Census Bureau API were successfully geocoded.  Further, the Google API matched 90% of the addresses with a ROOFTOP designation.  The average positional accuracy of the Google derived addresses were 80m overall, and 65m for those geocoded with the ROOFTOP designation while the Census Bureau positional accuracy was 271.09m.  

So yeah, this is what you can do with 7 GIS undergraduates at Salisbury University: they work hard, fast, and are a very creative bunch.

paper