Virtual layers with QGIS – wait a second, this is starting to look like Manifold…

If you are like me, you love writing spatial SQL. But what can you do if you have a geodatabase, some shapefiles, or any other data format? Well, you can always take those data formats and import them into Postgres or SQLite, and then issue the SQL for the results. Sure, it works, but it’s an extra step, and you no longer have access to the desktop GIS tools.

Well, with QGIS, you can drag and drop any data format into your project. And, with the DB Manager, you can refer to those layers as Virtual Layers, allowing you to issue SQL on them. Take a look at this video to see how issuing SQL on disparate data sources is a breeze.

I still think Manifold has the best integrated SQL engine tied to their GIS, but virtual layers in QGIS are seriously compelling, especially with the drag-and-drop capabilities of bringing in the different data sets.

If you want to learn more about spatial SQL or QGIS, check out my courses on Learning the FOSS4g Stack: QGIS Desktop, and Learning the FOSS4g Stack: Spatial SQL with Postgres and PostGIS.

Easy bivariate map with Postgres

A friend recently asked me about the cool looking bivariate maps produced with ArcOnline, lamenting that the capability seemed lacking in ArcGIS. Well, it turns out that ESRI has a .dll you can use, and there is a good article here. So, if you want to create these great looking maps in ArcGIS, it shouldn’t be a problem. The website will allow you to download the .dll, and you can also watch the video on how to use it. Well worth the time.

So of course that got me thinking: could we make the same map with spatial SQL? Well, sure, and it is super easy. If you want to get spun up on what these 9 color bivariate choropleth maps are, and the theory behind them, have a look at this great site. Josh does a great job explaining how this works, but it is a little cumbersome if you want to pump out map after map. But, with a very little bit of SQL, you can easily pull it off.

Let’s start with our data: I have a Postgres table of United States County boundaries, with the attributes percentobese and percentdiabetes, along with a FIPS code and a geometry column.

To prepare the data for the map, we simply issue this SQL query:

SELECT 
fips,geom,
ntile(3) over (order by percentobese) || '.' ||
ntile(3) over (order by percentdiabetes) AS bimode
INTO qlayer
FROM ushealthrisk

Yes, that’s it. Really.

I’m not kidding. We’re done, folks. Go home, nothing left to see.

Well, if you insist on reading, I’ll tell you what the SQL does, and how to actually visualize the data.
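If you'd like to see the idea outside of the database first, here is a minimal Python sketch of what the two ntile(3) calls are doing. It is only an illustration: the county values are made up, and the helper names ntile and bimode are my own for this sketch, not part of Postgres. Each attribute is ranked into three equal-sized groups, and the two group numbers are glued together with a '.' to give one of nine classes.

```python
def ntile(values, n):
    """Mimic SQL ntile(n) OVER (ORDER BY value): return a 1-based bucket per value.

    Rows are split as evenly as possible; when they don't divide evenly,
    the lower-numbered buckets each get one extra row.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    base, extra = divmod(len(values), n)
    buckets = [0] * len(values)
    pos = 0
    for b in range(1, n + 1):
        size = base + (1 if b <= extra else 0)
        for i in order[pos:pos + size]:
            buckets[i] = b
        pos += size
    return buckets


def bimode(first, second):
    """Combine two 3-class rankings into a code like '2.3' (9 possible classes)."""
    return [f"{a}.{b}" for a, b in zip(ntile(first, 3), ntile(second, 3))]


# Hypothetical values for six counties (not real data)
percentobese = [20.1, 35.4, 28.0, 31.2, 24.5, 38.9]
percentdiabetes = [7.0, 12.5, 9.1, 8.2, 13.0, 10.4]

codes = bimode(percentobese, percentdiabetes)
# codes == ['1.1', '3.3', '2.2', '2.1', '1.3', '3.2']
```

A county coded 3.3 is in the top third for both attributes, while 1.3 is in the bottom third for the first and the top third for the second. Those nine codes are what you would map to the nine colors.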

Continue reading

Big Data Results

I wanted to revisit the taxi data example that I previously blogged about.  I had a 6GB file of 16 million taxi pickup locations and 260 taxi zones.  I wanted to determine the number of pickups in each zone, along with the sum of all the fares.  Below is a more in-depth review of what was done, but for those of you not wanting to read ahead, here are the result highlights:

Platform | Command | Time
ArcGIS 10.4 | AddJoinManagement | Out of memory
ArcGIS Pro | Summarize Within | 1h 27m*
ArcGIS Server Big Data GeoAnalytics with Big Data File Share | Summarize Within / Aggregate Points | ~2m
Manifold 9 | GeomOverlayContained | 3m 27s
Postgres/PostGIS | ST_Contains | 10m 30s
Postgres/PostGIS (optimized) | ST_Contains | 1m 40s
*I’m happy ArcGIS Pro ran at this speed, but I think it can do better.  This is a geodatabase straight out of the box. I think we can fiddle with indexes and even structuring the data to get things to run faster.  That is something I’ll work on next week.
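To make the task itself concrete, here is a small Python sketch of the operation every platform above is performing: test which zone each pickup falls in, then count the pickups and sum the fares per zone. The zones, pickups, and function names are hypothetical, and this brute-force loop checks every point against every zone. The products in the table rely on spatial indexes and other optimizations that this sketch leaves out.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) is inside the polygon (a list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def summarize_within(zones, pickups):
    """Return {zone_id: (pickup_count, fare_sum)} for pickups given as (x, y, fare)."""
    totals = defaultdict(lambda: [0, 0.0])
    for x, y, fare in pickups:
        for zone_id, polygon in zones.items():
            if point_in_polygon(x, y, polygon):
                totals[zone_id][0] += 1
                totals[zone_id][1] += fare
                break  # zones in this sketch don't overlap
    return {zone_id: (count, total) for zone_id, (count, total) in totals.items()}


# Two hypothetical square zones and four pickups (the last one falls outside both)
zones = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
pickups = [(2, 3, 12.5), (5, 5, 7.0), (15, 2, 20.0), (30, 30, 9.0)]

result = summarize_within(zones, pickups)
# result == {'A': (2, 19.5), 'B': (1, 20.0)}
```

With 16 million points and 260 zones, this naive approach means billions of polygon tests, which is why indexing makes such a difference in the timings.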

Continue reading

Finding “Dangles” with PostGIS

Do you have a set of lines that you need to check for “dangle” nodes?  A dangle is a line segment that overhangs another line segment.  Now, some dangles are valid, like a pipe that terminates in a cul-de-sac.

A few people have posted about this already, but I figured I would give it a shot as well, as I think my SQL is a little more terse.  Anyway, here is the query, and we’ll talk about it line by line:

SELECT DISTINCT g1 AS g INTO dangles
FROM plines, 
    (SELECT g AS g1 FROM  
         (SELECT g, count(*) AS cnt  
          FROM  
              (SELECT  ST_StartPoint(g) AS g FROM plines
               UNION ALL
               SELECT  ST_EndPoint(g) AS g FROM plines ) AS T1 
         GROUP BY g) AS T2
     WHERE cnt = 1) AS T3
WHERE ST_Distance(g1, g) BETWEEN 0.01 AND 2;
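If the nested subqueries are hard to follow, the same logic can be sketched in plain Python. This is only an illustration with made-up lines and helper names of my own. It collects every start and end point, keeps the ones that occur exactly once, and then keeps only those that sit between 0.01 and 2 units from some line, meaning close to a line but not touching it.

```python
import math
from collections import Counter


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its ends
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_line_distance(p, line):
    """Shortest distance from point p to a polyline (a list of vertices)."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def find_dangles(lines, lo=0.01, hi=2.0):
    # Like the UNION ALL of start and end points, grouped and counted
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1
        counts[line[-1]] += 1
    lonely = [p for p, c in counts.items() if c == 1]
    # Like the outer WHERE: near some line, but not touching it
    return sorted({p for p in lonely for line in lines
                   if lo <= point_line_distance(p, line) <= hi})


# Hypothetical network: the second line stops 0.5 units short of the first
lines = [
    [(0, 0), (10, 0)],
    [(5, 5), (5, 0.5)],
    [(10, 0), (10, 8)],
]
dangles = find_dangles(lines)
# dangles == [(5, 0.5)]
```

Note that an endpoint is always at distance 0 from its own line, so the 0.01 lower bound is what stops every unshared endpoint from matching itself.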

Continue reading

Multi-Ring (non-overlapping) Buffers with PostGIS

I was interested in creating multi-ring buffers but with a twist: I didn’t want the buffers to overlap one another.  In other words, if I had concentric buffers with distances of 100, 200, and 300 around a point, I want those buffers to reflect distances of 0-100, 100-200, and 200-300.  I don’t want them overlapping one another.  You can actually do that with the PostGIS function ST_SymDifference, but there are a few nuances that you have to be aware of.

Unlike some of my longer videos, this one will start out with the answer, and then we’ll walk through all the SQL.  You’ll see it isn’t so bad.  And you’ll continue to see that spatial is not special!  It’s only 20 minutes long, but the answer is shown in the first minute.

In the video I’ll slowly walk you through all the spatial SQL to create buffers for the points and trim all the overlaps so that there are no overlapping buffers.  You’ll learn some really cool Postgres commands  including:

ST_Buffer, ST_SymDifference, DISTINCT ON, and SET WITH OIDS.

I found myself amazed that with a few SQL tweaks, we were able to turn ordinary buffers to more useful non-overlapping buffers.  I hope you enjoy the video.
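For a feel of what "non-overlapping" means, here is a tiny Python sketch of the simplest case only: rings around a single point. It is not the PostGIS approach from the video, and the function names are my own. Each ring is treated as the band between one buffer distance and the next smaller one, so distances of 100, 200, and 300 become bands of 0-100, 100-200, and 200-300.

```python
import math


def ring_bands(distances):
    """Turn buffer distances into non-overlapping (inner, outer) bands."""
    ordered = sorted(distances)
    return list(zip([0] + ordered[:-1], ordered))


def ring_for(center, point, distances):
    """Return the (inner, outer) band containing point, or None if it is beyond the last ring."""
    d = math.hypot(point[0] - center[0], point[1] - center[1])
    for inner, outer in ring_bands(distances):
        if d <= outer:
            return (inner, outer)
    return None


def ring_area(inner, outer):
    """Area of the band between two concentric circles."""
    return math.pi * (outer ** 2 - inner ** 2)


bands = ring_bands([300, 100, 200])
# bands == [(0, 100), (100, 200), (200, 300)]

hit = ring_for((0, 0), (150, 0), [100, 200, 300])
# hit == (100, 200)
```

Because the bands don't overlap, their areas add up to exactly the area of the largest buffer. The harder part, which this sketch doesn't touch, is rings from different points running into each other.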

I’d like to create more videos like that – please leave some comments below so that I know others want me to continue these kinds of tutorials.

 If you want to learn more about SQL, programming, open source GIS, or Manifold GIS, check out courses at www.gisadvisor.com.  

GIS Analysis of Overlapping Layers

My friend is attempting to quantify the area of different landuse values for different areas that are upstream from her sample points.  This means she needs sample points, landuse, and upstream areas (i.e. sub-watersheds).  The problem is, her watersheds overlap, the buffer distances around the sample points overlap themselves AND the watersheds, and she then needs to summarize the results.  It’s actually a tricky problem due to the overlaps: GIS software doesn’t really like it when features within a single layer overlap one another.  Also, if a buffer for a sample point overlaps two different watersheds, that becomes tricky too.

Sure, you can solve it with a few for loops, inserting the results into a new table, but that really is a hassle.  Also, I have to do it for different distances and different land cover types.

So, I once again turned to SQL – remember what I keep telling you – spatial is not special.  It’s just another data type.  This video steps you through performing a multi-ring buffer on overlapping objects from 3 different layers: sample points, watersheds, and land use.  As we step through the SQL, you’ll see how easy it is to put the query together.  And, at the end, you’ll see how flexible the query is should you want to change your objectives.  And, for good measure, we’ll throw in a little bit of parallel processing.

Big Data GeoAnalytics – adding data

Continuing my series on big data geoanalytics, I wanted to show how to bring in large data sets so that we can start working with them. The data set we’ll use is the NYC taxi data that includes information on pickups and dropoffs. There are about 13 million records in a 2.2GB .csv file. That is not insanely large, but it is large enough for us to start messing around with it (don’t worry, I have a few 20GB+ data sets that I am working with and will eventually show those to you as well).

The video below will walk you through the steps I took to load and prepare the NYC taxi data inside of Manifold Future. My next posts will begin to look at how we can begin interrogating the data source to find meaningful information.

I hope you enjoy the video. Please comment below – I’d love to hear what people think.

 

Big Data GeoAnalytics – Turning Points to Lines

In my last video, I gave a sort of mile-high view of how SQL can be used for big data geoanalytics.  I want to dive a little deeper, and explore the idea of creating linear features from a time-series of points.

Once again, using some basic SQL and spatial SQL, we can perform basic time-series analysis.
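The core idea is simple enough to sketch in a few lines of Python. This is not the SQL from the video, just an illustration with hypothetical records and a function name of my own. Group the points by the thing being tracked, sort each group by time, and string the coordinates together in that order.

```python
from collections import defaultdict


def points_to_lines(records):
    """Build one ordered coordinate list per track from (track_id, timestamp, x, y) records."""
    tracks = defaultdict(list)
    for track_id, timestamp, x, y in records:
        tracks[track_id].append((timestamp, x, y))

    lines = {}
    for track_id, points in tracks.items():
        points.sort(key=lambda p: p[0])   # order the points by time
        if len(points) >= 2:              # a line needs at least two points
            lines[track_id] = [(x, y) for _, x, y in points]
    return lines


# Hypothetical GPS fixes, deliberately out of order
records = [
    ("cab1", 3, 2.0, 2.0),
    ("cab1", 1, 0.0, 0.0),
    ("cab2", 1, 5.0, 5.0),
    ("cab1", 2, 1.0, 0.5),
    ("cab2", 2, 6.0, 5.5),
    ("cab3", 1, 9.0, 9.0),
]

lines = points_to_lines(records)
# lines == {'cab1': [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0)],
#           'cab2': [(5.0, 5.0), (6.0, 5.5)]}
```

In SQL terms, that is a GROUP BY on the track identifier with the points ordered by timestamp inside each group. The track with a single point is dropped because it can't form a line.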

I’m enjoying making these videos, as they are helping me put my course on big data and GIS together.  I hope you like them too.  Please comment down below so that I know this is something the user community enjoys and is learning from.

Also, if you are interested in learning more about how to perform spatial SQL in Microsoft SQL Server, Postgres, or Manifold, visit my other site, www.gisadvisor.com, to sign up for my online video courses.

Big data geo-analytics with SQL

I’m getting ready to create a course in big data analytics with GIS.  I have lots of ideas as to what to do, but one thing I know is that I will be using spatial databases and SQL.  I’ll also be using Manifold Future.

ESRI has recently introduced their ArcGIS GeoAnalytics Server, which will introduce many GIS professionals to big data analytics with GIS.  They have some interesting scenarios and example data using NYC taxi cabs.  I think these will be really good case studies.

This video (just shy of 20 minutes) will use SQL and Manifold to try and address these big data problems.

Keep an eye on my blog as I will be rolling out new ideas as I prepare my course for the Spring.

If you like the video, and want to learn more about how to improve your spatial database skills, check out my videos at www.gisadvisor.com.

Maryland State GIS Conference (TuGIS)

The TuGIS training workshop on March 20, 2017 is completed – you can see the workshop evaluations below:  

The workshop evaluations are in

(if you want to cut to the chase, the workshop results are here).

I had a great time teaching our two workshops at the TuGIS conference.  In the morning, my students and I presented Spatial SQL: A Language for Geographers, and in the afternoon we taught Python for Geospatial.

We knew expectations would be high: both courses sold out in 2 days, and we even expanded the class size to 38 people for each workshop!!  I knew that teaching 38 people would be a challenge, but it would also be a great lesson to see if we could corral so many cats into a single, technical workshop.  The workshop evaluations would be crucial to determine if we met our objectives.

The workshop evaluations were overwhelmingly positive.  For example:

  1. over 90% said they enjoyed the workshop.
  2. over 83% said it was much better than other GIS training they have been to.
  3. on a scale of 1-10, 95% of the attendees rated the course a 7 or above.
  4. 93% said they learned something new in the workshop.
  5. 89% said the workshops would help them in their careers.
  6. 91% said they would apply these skills to their job.

I decided to throw one curve-ball on the evaluation sheet and asked:

This was a half-day workshop. Most one-day GIS training classes cost around $600/day. If we developed other in-depth full-day workshops on topics like this for under $250, how likely would you be to participate in it?

It turned out that 89% of the respondents gave a rating of 7 or higher, indicating that almost 90% of the people valued the training enough to pay $250 for a full day course (as opposed to $600 for most GIS courses).  This means it is possible to offer really good, low cost training to GIS professionals.  Keep an eye out on this, as I am very likely to take these training classes on the road.

The comments the participants provided were great – they confirmed our belief that this was an excellent training course, and that the course needed to be expanded to 8 hours, rather than 4 hours – most everyone felt like there was simply too much information to absorb.

If you would like to see the results of the workshop evaluation, click the link below:

TuGIS Workshops – Google Forms

Finally, if you can’t make it to a live workshop, all of my video training courses are $30 or less when you visit www.gisadvisor.com.  These courses can’t get into the level of depth that a live course gives, but you’ll see that after thousands of students taking the courses, close to 90% of them give the course 4 stars out of 5!