Finding “Dangles” with PostGIS

Do you have a set of lines that you need to determine if there are any “dangle” nodes?  A dangle is a line segment that overhangs another line segment.  Now, some dangles are valid, like a pipe that terminates in a cul-de-sac.

A few people have posted about this already, but I figured I would give it a shot as well, as I think my SQL is a little more terse.  Anyway, here is the query, and we’ll talk about it line by line:

SELECT DISTINCT g1 ASINTO dangles
FROM plines, 
    (SELECT g AS g1 FROM  
         (SELECT g, count(*) AS cnt  
          FROM  
              (SELECT  ST_StartPoint(g) AS g FROM plines
               UNION ALL
               SELECT  ST_EndPoint(g) AS g FROM plines ) AS T1 
         GROUP BY g) AS T2
     WHERE cnt = 1) AS T3
WHERE ST_Distance(g1, g) BETWEEN 0.01 AND 2;

The first thing to notice is the most inner select statement.  We are using ST_StartPoint and ST_EndPoint to grab the endpoints of the lines – these we’ll call nodes.

The next line to notice is where we are getting the count of the nodes.  We are grabbing all the nodes, but using the GROUP BY function to return the number of nodes that occupy a place in space.  Now, an intersection of two lines will have 2 nodes (from the first line and the second line).  But, a “dangle” will only have one node occupying a space.  This is where the next section of SQL comes into play.

What we want to do is only select those nodes where the count (cnt) is equal to 1.  That means the node is just sitting there in space.  It is a “dangle”.  But, not all dangles are created equally, as I said above.  That final WHERE clause lets me specify how far I want a dangle node to be from another node.  In the example above, we are choosing under 2m apart.  The last bit of SQL we have to consider is the DISTINCT clause.  Nodes can be near one or more lines.  We don’t want to double count them, so using DISTINCT eliminates the duplicates.

That’s it!  Pretty easy.  Think of the ST_Distance function as a variant of the basic SQL to find dangles.  There are other variants we could add to this if we’d like, such as the length of the line the dangle touches has to be less than 5m, or something like that.  That would be just a matter of adding another WHERE clause.

 

 

Multi-Ring (non-overlapping) Buffers with PostGIS

I was interested in creating mult-ring buffers but with a twist: I didn’t want the buffers to overlap one another.  In other words, if I had concentric buffers with distances of 100, 200, and 300 around a point, I want those buffers to reflect distances of 0-100, 100-200, and 200-300.  I don’t want them overlapping one another.  You can actually do that with the PostGIS function ST_SymDifference, but there are a few nuances that you have to be aware of.

Unlike some of my longer videos, this one will start out with the answer, and then we’ll walk through all the SQL.  You’ll see it isn’t so bad.  And, you continue to see that spatial is not special!.  It’s only 20 minutes long, but the answer is shown in the first minute.

In the video I’ll slowly walk you through all the spatial SQL to create buffers for the points and trim all the overlaps so that there are no overlapping buffers.  You’ll learn some really cool Postgres commands  including:

 ST_BufferST_DifferenceSymDISTINCT ON, and SET WITH OIDS.

I found myself amazed that with a few SQL tweaks, we were able to turn ordinary buffers to more useful non-overlapping buffers.  I hope you enjoy the video.

I’d like to create more videos like that – please leave so comments below so that I know others want me to continue these kinds of tutorials.

 If you want to learn more about SQL, programming, open source GIS, or Manifold GIS, check out courses at www.gisadvisor.com.  

GIS Analysis of Overlapping Layers

overlayoverlapMy friend is attempting to quantify the area of different landuse values for different areas that are upstream from her sample points.  This means she needs sample points, landuse, and upstream areas (i.e. sub-watersheds).  The problem is, her watersheds overlap, the buffer distances around the sample points overlap themselves AND the watersheds, and she then needs to summarize the results.  It’s actually a tricky problem due to the overlaps: GIS software doesn’t really like when features within a single layer overlap one another.  Also, if a buffer for a sample point overlaps two different watersheds, that becomes tricky too.

Sure you can solve it with a few for loops,  inserting the results into a new table, but that really is a hassle.  Also, I have to do it for different distances and different land cover types.

So, I once again turned to SQL – remember what I keep telling you – spatial is not special.  It’s just another data type.  This video steps you through performing a multi-ring buffer on overlapping objects from 3 different layers: sample points, watersheds, and land use.  As we step through the SQL, you’ll see how easy it is to put the query together.  And, at the end, you’ll see how flexible the query is should you want to change your objectives.  And, for good measure, we’ll throw in a little bit of parallel processing.

Big Data GeoAnalytics – adding data

Continuing my series on big data geoanalytics, I wanted to show how to bring in large data sets so that we can start working with them. The data set we’ll use is the NYC taxi data that includes information on pickup and dropoffs. There are about 13 million records in a 2.2GB .csv file. That is not insanely large, but it is large enough for us to start messing around with it (don’t worry, I have a few 20GB+ data sets that I am working with and will eventually show that to you as well).

This video below will walk you through the steps I took to load and prepare the NYC taxi data inside of Manifold Future. My next posts will begin to look at how we can begin interrogating the data source to find meaningful information.

I hope you enjoy the video. Please comment below – I’d love to hear what people think.

 

Big Data GeoAnalytics – Turning Points to Lines

In my last video, I gave a short of mile-high view of how SQL can be used for big data geoanalytics.  I want to dive a little deeper, and explore the idea of create linear features from a time-series of points.

Once again, using some basic SQL and spatial SQL, we can perform basic time-series analysis.

I’m enjoying making these videos, as they are helping me put my course on big data and GIS together.  I hope you like them too.  Please comment down below so that I know this is something the user community enjoys and is learning from.

Also, if you are interested in learning more about how to perform spatial SQL in Microsoft SQL Server, Postgres, or Manifold, visit my other site, www.gisadvisor.com to sign up for my online video courses.

Big data geo-analytics with SQL

I’m getting ready to create a course in big data analytics with GIS.  I have lots of ideas as to what to do, but one thing I know is that I will be using spatial databases and SQL.  I’ll also be using Manifold Future.

ESRI has recently introduced their ArcGIS GeoAnalytics Server, which will introduce many GIS professionals to big data analytics with GIS.  They have some interesting scenarios and example data using NYC taxi cabs.  I think these will be really good case studies.

This video (just shy of 20 minutes) will use SQL and Manifold to try and address these big data problems.

Keep an eye on my blog as I will be rolling out new ideas as I prepare my course for the Spring.

if you like the video, and want to learn more about how to improve your spatial database skills, check out my videos at www.gisadvisor.com.

Maryland State GIS Conference (TuGIS)

The TuGIS training workshop on March 20, 2017 is completed – you can see the workshop evaluations below:  

The workshop evaluations are in

(if you want to cut to the chase, the workshop results are here).

I had a great time teaching our two workshops at the TuGIS conference.  In the morning, my students and I presented Spatial SQL: A Language for Geographers, and in the afternoon we taught Python for Geospatial.

We knew expectations would be high: both courses sold-out in 2 days, and we even expanded the class size to 38 people for each workshop!!  I knew that teaching 38 people would be a challenge, but it would also be a great lesson to see if we could corral so many cats into a single, technical workshop.  The workshop evaluations would be crucial to determine if we met our objectives.

The workshop evaluations were overwhelmingly positive.  For example:

  1. over 90% said they enjoyed the workshop.
  2. over 83% said it was much better than other GIS training they have been to.
  3. on a scale of 1-10, 95% of the attendees rated the course a 7 or above.
  4. 93% said they learned something new in the workshop.
  5. 89% said the workshops would help them in their careers.
  6. 91% said they would apply these skills to their job.

I decided to throw one curve-ball on the evaluation sheet and asked:

This was a half-day workshop. Most one-day GIS training classes cost around $600/day. If we developed other in-depth full-day workshops on topics like this for under $250, how likely would you be to participate in it?

it turned out that 89% of the respondents rated a 7 or higher, indicating that almost 90% of the people valued the training enough to pay $250 for a full day course (opposed to $600 for most GIS courses).  This means it is possible to offer really good, low cost training to GIS professionals.  Keep an eye out on this, as I am very likely to take these training classes on the road.

The comments the participants provided were great – it confirmed our belief that this was an excellent training course, and that the course needed to be expanded to 8 hours, rather than 4 hours – most everyone felt like their was simply too much information to absorb.

If you would like to see the results of the workshop evaluation, click the link below:

TuGIS Workshops – Google Forms

Finally, if you can’t make it to a live workshop, all of my video training courses are $30 or less, when you visit www.gisadvisor.com.  These courses can’t get into the level of depth that a live course gives, but you’ll see that after thousands of students taking the courses, close to 90% of them give the course 4 starts out of 5!

 

Radian can read ESRI geodatabases

radianI just got the new build of Radian Studio, and now that it can directly read geodatabases, you can link directly to the geodatabase from inside of Radian and perform Radian spatial queries on it.  In this video, I’m linking directly to an ESRI geodatabase, creating a small map display of the data, performing a spatial clip of two vector layers, and returning the results.   In my previous video, I showed how Radian studio can read data directly from PostgreSQL, SQLite, and also easily exchange data between them.   This is just another example of how Radian can manage disparate GIS databases.

 

l hope you like the video, if so, consider learning more about Radian Studio in my online course here.

Workshops at the Maryland Geospatial Conference

tugisThe Maryland’s Geospatial Conference  () is on March 20/21, 2017.  I first attended TUgis in 1990, and it is always a great conference.  It is not too large, so it is  great way to have extended time with people.  So, if you had a technical question for someone from say ESRI, you could simply stop by their booth and have a chat.

This year I was asked to support the pre-conference workshops.  I will be presenting two workshops with the help of my students.  If you recall, my students are quite good at instructing others about GIS technology.  I’m really looking forward to the conference and interacting with people during the workshop.  Keep in mind, this is not something we are just throwing together – we’ve been spending a lot of time thinking about how to effectively move people through the material so that beginners do not get lost, and more technically savvy people are sufficiently challenged.  We are fanatical about making sure people’s learning experience is excellent.

A description of the courses are found here:

Spatial SQL: A Language for Geographers:  Are you stuck in a rut of only knowing how to use a GIS GUI? Do you want to learn how to automate tasks, but are afraid of computer programming. If so, SQL is the most powerful tool you can learn to help you perform complex GIS tasks. This hands-on course is designed to teach you how SQL can replicate many database and GIS tasks. We will start at a very basic overview and then proceed to more advanced topics related to GIS.

Topics to include:

  • Spatial is NOT Special
  • SQL Data Types
  • Traditional SQL
  • Spatial SQL for Vector and Raster Analysis
  • Spatial SQL for Classic Geographic Analysis

For this class, we’ll be using spatiaLite which is the spatial extension used with SQLite.  This is a great way to get started, as it is very similar to the functionality of Postgres/PostGIS.  If you want to move to enterprise GIS with Postgres or even Oracle or SQLServer, you’ll be in really good shape.

Python for Geospatial: If you are in the field of GIS, you’ve probably heard everyone talking about Python, whether it’s Arcpy in ArcGIS or special Python packages for doing things in open source.  In this hands-on workshop you will learn how Python is used to perform GIS analysis. The workshop will be an introduction to Python, with emphasis on integrating multiple Python plug-ins with ArcGIS and open source GIS.

Topics to include:

  • An overview of Python (variables, statements, I/O, writing code)
  • Python plug-ins for Geospatial (numpy, geocoder, pygal, Postgres)
  • A Taste of Arcpy
  • A Data Analytics Project with Python (for this, we will geocode addresses using Python, perform analysis with open source GIS, take the results into Arcpy to do more GIS analysis, compute statistical results with Python calling Excel, and then create charts and graphs of the results for use on the Internet—without ever opening up a single GIS product.)

If you want to learn more about how to use GIS technology, check out the 9 courses at gisadvisor.com.  

Undergraduate Geospatial Python Projects

earthballThis week my GIS Programming students presented their programming projects to ESRI. First, I cannot say enough to thank ESRI for taking time out of their schedule to meet with our students – the staff was helpful, encouraging, and provided great feedback to the students – what an honor it was to get their feedback.  I am so thankful to be a part of a GIS community that is so supportive of one another.

Now, this was a really special class of undergraduates – and some of them were part of that special group of students that presented their research at an undergraduate conference.  It was small, so we could do some really cool things.  In fact, in the middle of the semester, the students wrote a paper comparing the geocoding accuracies of Google Maps and the United States Census Bureau.

Things were going so well that I decided in lieu of a final exam, we expanded their final projects a little more, and arranged for the staff at ESRI Charlotte and ESRI Redlands to join us on a WebEx that included demonstrations and a code walk-through.  Below are each students’ presentation, and some of the Q&A from ESRI:

noahNoah Krach.  Noah is an amazing undergraduate.  Recently, we lost one of our graduate research associates, and Noah stepped in to provide technical support on a National Science Foundation project in Lake Victoria.  Without missing a beat, Noah was all over the project, and he used his time in my class to create an Arcpy tool to extract, translate, and load (ETL) gigabytes of Landsat imagery.  This tool does a lot, and I can’t even begin to describe all he did, you’ll simply have to watch and learn.

Check out his video, and you’ll see why we are so excited that Noah will be around for another semester.

cc

Caitlin Curry.  If you follow my blog, you’ve already met Caitlin.  She finished her summer internship I told you about, and during the middle of it, her boss wrote us to say what an excellent worker she was (he prefaced his email by saying he never does that, but was so impressed with Caitlin, he had to let us know).  We are impressed with Caitlin, too.  And, as I have now grown to expect, Caitlin did an amazing job with another ETL type tool using Arcpy, where she downloaded, unzipped, and processed earthquake data and critical infrastructure.

I did a lot of emergency response work with earthquakes in a previous life, and what Caitlin did here would have been so useful.  I think you will enjoy seeing how she integrated many different Python packages with Arcpy to provide an early warning application for emergency responders.  And just as a heads-up, Caitlin uses Python to download everything while the script is running – so you just give the script to a user and it works without any operator knowledge of the underlying data = really cool, and efficient.

mb

Matthew Bucklew.  After my first lecture this semester, Matt told me he built his own computer this summer – just for fun.  So, I knew he wasn’t your ordinary  geographer – he likes to try new things, and if something is done in a conventional way, Matt is going to try and be more innovative.  Matt created a great Arcpy application to locate renewable engery stations needed by automobiles.  His Python scripts use ArcGIS for analysis, but at the same time, seamlessly brings in the Google APIs to provide directions to the nearest locations.  For good measure, he also brings in other packages like heapq.

At the moment, Matt’s program works on a desktop, but his hope it to turn this application into a cloud based solution for use with mobile phones.  Keep an eye out for what Matt comes up with, and if you watch this, you’ll see it is an excellent tutorial on how to mash up bunches of Python packages with Arcpy.

jmJessica Molnar.  Like Caitlin, Jessica is another student you’ve seen before.  She’s got such a big heart, and is always looking for ways to apply GIS to humanitarian and ecological solutions.  In this project, Jessica created an Arcpy application to identify locations for community gardens in Baltimore City with special consideration for locations within food deserts, near churches and schools, and on suitable soils for growing food.  Jessica’s program also found those locations that were already owned by the City, but were vacant.  Let’s hope the City makes use of this to build a more beautiful Baltimore (BTW, Jessica wrote her program to work in any location in the State of Maryland, so any community can use this tool!).  I think Jessica may eventually roll this into a cloud based solution – hey Jessica, I think we found a project for graduate school!

 

jtJohn Tilghman.  John’s family owns an orthodontist practice, and John decided to use PostGRES/PostGIS along with a number of other different Python packages to perform market area analysis.  John integrated PostGRES, Google, and the Pygal libraries to create the first stages of a geodashboard to assess the effectiveness of marketing strategies, and other metrics.  In the video, you’ll also see how he created a distance decay algorithm in SQL to determine at what point customers drop off from visiting the practice.  With just a little bit of information (addresses and marketing strategies), John was able to extract a ton of business information – in fact, our guests from ESRI were surprised the John wasn’t already a business major!

This is an excellent presentation to watch for those of you who are interested in using Python with Open Source GIS – you’ll learn how to integrate FOSS4g and Python for a business analytics tool.

 

jyJosh Young.  Josh created an Arcpy script to assemble tons of location based data that might be useful for someone thinking about moving to a particular location.  Now, in Josh’s case, he chose location based data he deemed important for the neighborhood (download speeds, elementary school, crime statistics, distance to the downtown, etc.).  But ultimately, what Josh has shown us is how to create a template that integrates multiple Python packages and online data to provide very useful information.

It would be so easy to take Josh’s work and roll it into a site specific location-based analysis engine.  In fact, one of the people watching Josh’s presentation mentioned that he was moving, and saw how useful this could be for a community.  The best part of it is that Josh did it with all freely available online data for the State of Maryland, so any community can spin this up into a cloud-based solution.

 

image2

Robbie Stancil.  Robbie is our only non-geography major.  You’ve met him before when he worked with me on a National Science Foundation project to use Spatial Hadoop.  Like John, Robbie’s project used Postgres/PostGIS and the Google API to do something quite interesting: he created a mesh of points over community to determine how far the Google API will search in order to find a property address, and compared the concave hull of each series of points for an address to the actual property parcel.  This project got us thinking about some very creative uses – you’ll have to watch it until the end to see the interesting things we came up with.

 

Again, I have to give a huge shout out to the ESRI staff – they were wonderful guests, and really excellent mentors during the Q&A. As these students get ready to graduate in May, I know they will make excellent employees or graduate students – the future is really bright for them. If you are in academia, I hope that you are inspired to expect the very best of your students as I do, and you’ll be so pleased to see what they are capable of doing.

want to learn how to program geospatial solutions like these students? Check out the geospatial courses at gisadvisor.com.