Undergraduate Geospatial Python Projects

This week my GIS Programming students presented their programming projects to ESRI. First, I cannot say enough to thank ESRI for taking time out of their schedule to meet with our students – the staff was helpful and encouraging, and provided great feedback to the students – what an honor it was to get their feedback.  I am so thankful to be a part of a GIS community that is so supportive of one another.

Now, this was a really special class of undergraduates – and some of them were part of that special group of students that presented their research at an undergraduate conference.  It was small, so we could do some really cool things.  In fact, in the middle of the semester, the students wrote a paper comparing the geocoding accuracies of Google Maps and the United States Census Bureau.

Things were going so well that, in lieu of a final exam, I decided to expand their final projects a little more and arranged for the staff at ESRI Charlotte and ESRI Redlands to join us on a WebEx that included demonstrations and a code walk-through.  Below is each student’s presentation, along with some of the Q&A from ESRI:

Noah Krach.  Noah is an amazing undergraduate.  Recently, we lost one of our graduate research associates, and Noah stepped in to provide technical support on a National Science Foundation project in Lake Victoria.  Without missing a beat, Noah was all over the project, and he used his time in my class to create an Arcpy tool to extract, transform, and load (ETL) gigabytes of Landsat imagery.  This tool does a lot, and I can’t even begin to describe all he did; you’ll simply have to watch and learn.

Check out his video, and you’ll see why we are so excited that Noah will be around for another semester.

Caitlin Curry.  If you follow my blog, you’ve already met Caitlin.  She finished her summer internship I told you about, and during the middle of it, her boss wrote us to say what an excellent worker she was (he prefaced his email by saying he never does that, but was so impressed with Caitlin, he had to let us know).  We are impressed with Caitlin, too.  And, as I have now grown to expect, Caitlin did an amazing job with another ETL-type tool using Arcpy, where she downloaded, unzipped, and processed earthquake data and critical infrastructure.

I did a lot of emergency response work with earthquakes in a previous life, and what Caitlin did here would have been so useful.  I think you will enjoy seeing how she integrated many different Python packages with Arcpy to provide an early warning application for emergency responders.  And just as a heads-up, Caitlin uses Python to download everything while the script is running – so you just give the script to a user and it works without any operator knowledge of the underlying data.  Really cool, and efficient.

Matthew Bucklew.  After my first lecture this semester, Matt told me he built his own computer this summer – just for fun.  So, I knew he wasn’t your ordinary geographer – he likes to try new things, and if something is done in a conventional way, Matt is going to try to be more innovative.  Matt created a great Arcpy application to locate the renewable energy stations needed by automobiles.  His Python scripts use ArcGIS for analysis, but at the same time seamlessly bring in the Google APIs to provide directions to the nearest locations.  For good measure, he also brings in other packages like heapq.

At the moment, Matt’s program works on a desktop, but his hope is to turn this application into a cloud-based solution for use with mobile phones.  Keep an eye out for what Matt comes up with, and if you watch this, you’ll see it is an excellent tutorial on how to mash up bunches of Python packages with Arcpy.
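To give a feel for where heapq fits in a project like this, here is a minimal sketch of a "nearest stations" lookup.  It is not Matt’s code: the station names and coordinates are made up, the helper names are hypothetical, and I am assuming straight-line (great-circle) distance rather than the driving distance the Google APIs would return.

```python
import heapq
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def nearest_stations(origin, stations, k=3):
    """Return the k closest stations as (distance_km, name) tuples."""
    lat0, lon0 = origin
    scored = ((haversine_km(lat0, lon0, lat, lon), name)
              for name, lat, lon in stations)
    # heapq.nsmallest avoids sorting the whole list when k is small
    return heapq.nsmallest(k, scored)

# Hypothetical stations: (name, latitude, longitude)
stations = [
    ("Station A", 38.37, -75.60),
    ("Station B", 38.40, -75.57),
    ("Station C", 39.29, -76.61),
    ("Station D", 38.98, -76.49),
]

for dist, name in nearest_stations((38.3607, -75.5994), stations, k=2):
    print("%s: %.1f km" % (name, dist))
```

The shortlist that comes back is what you would then hand to a directions service, so you only pay for routing requests on a few candidates.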

Jessica Molnar.  Like Caitlin, Jessica is another student you’ve seen before.  She’s got such a big heart, and is always looking for ways to apply GIS to humanitarian and ecological solutions.  In this project, Jessica created an Arcpy application to identify locations for community gardens in Baltimore City with special consideration for locations within food deserts, near churches and schools, and on suitable soils for growing food.  Jessica’s program also found those locations that were already owned by the City, but were vacant.  Let’s hope the City makes use of this to build a more beautiful Baltimore (BTW, Jessica wrote her program to work in any location in the State of Maryland, so any community can use this tool!).  I think Jessica may eventually roll this into a cloud-based solution – hey Jessica, I think we found a project for graduate school!

 

John Tilghman.  John’s family owns an orthodontic practice, and John decided to use Postgres/PostGIS along with a number of other Python packages to perform market area analysis.  John integrated Postgres, Google, and the Pygal libraries to create the first stages of a geodashboard to assess the effectiveness of marketing strategies, and other metrics.  In the video, you’ll also see how he created a distance decay algorithm in SQL to determine at what point customers drop off from visiting the practice.  With just a little bit of information (addresses and marketing strategies), John was able to extract a ton of business information – in fact, our guests from ESRI were surprised that John wasn’t already a business major!

This is an excellent presentation to watch for those of you who are interested in using Python with Open Source GIS – you’ll learn how to integrate FOSS4G and Python for a business analytics tool.

 

Josh Young.  Josh created an Arcpy script to assemble tons of location-based data that might be useful for someone thinking about moving to a particular location.  Now, in Josh’s case, he chose location-based data he deemed important for the neighborhood (download speeds, elementary school, crime statistics, distance to the downtown, etc.).  But ultimately, what Josh has shown us is how to create a template that integrates multiple Python packages and online data to provide very useful information.

It would be so easy to take Josh’s work and roll it into a site-specific, location-based analysis engine.  In fact, one of the people watching Josh’s presentation mentioned that he was moving, and saw how useful this could be for a community.  The best part of it is that Josh did it with all freely available online data for the State of Maryland, so any community can spin this up into a cloud-based solution.

 

Robbie Stancil.  Robbie is our only non-geography major.  You’ve met him before when he worked with me on a National Science Foundation project to use Spatial Hadoop.  Like John, Robbie’s project used Postgres/PostGIS and the Google API to do something quite interesting: he created a mesh of points over a community to determine how far the Google API will search in order to find a property address, and compared the concave hull of each series of points for an address to the actual property parcel.  This project got us thinking about some very creative uses – you’ll have to watch it until the end to see the interesting things we came up with.
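For readers who want to picture the mesh idea, here is a rough sketch under my own assumptions; it is not Robbie’s code.  The function names are hypothetical, the bounding box is arbitrary, and the reverse geocoder is a stand-in function so the example runs without calling any web service.

```python
def make_mesh(min_lon, min_lat, max_lon, max_lat, step):
    """Return (lon, lat) points on a regular grid covering the bounding box."""
    if step <= 0:
        raise ValueError("step must be positive")
    n_cols = int(round((max_lon - min_lon) / step)) + 1
    n_rows = int(round((max_lat - min_lat) / step)) + 1
    return [
        (round(min_lon + c * step, 6), round(min_lat + r * step, 6))
        for r in range(n_rows)
        for c in range(n_cols)
    ]

def group_by_address(points, reverse_geocode):
    """Group mesh points by the address the geocoder returns for each one."""
    groups = {}
    for pt in points:
        address = reverse_geocode(pt)
        if address is not None:
            groups.setdefault(address, []).append(pt)
    return groups

def fake_reverse_geocode(pt):
    """Stand-in for a real geocoding call: splits the box into two 'addresses'."""
    return "100 Main St" if pt[0] < -75.6075 else "200 Main St"

mesh = make_mesh(-75.61, 38.36, -75.60, 38.37, 0.005)
groups = group_by_address(mesh, fake_reverse_geocode)
for address in sorted(groups):
    print(address, len(groups[address]))
```

Each group of points could then be loaded into PostGIS and wrapped with something like ST_ConcaveHull for comparison against the parcel polygon.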

 

Again, I have to give a huge shout out to the ESRI staff – they were wonderful guests, and really excellent mentors during the Q&A. As these students get ready to graduate in May, I know they will make excellent employees or graduate students – the future is really bright for them. If you are in academia, I hope that you are inspired to expect the very best of your students as I do, and you’ll be so pleased to see what they are capable of doing.

Want to learn how to program geospatial solutions like these students? Check out the geospatial courses at gisadvisor.com.

Arcpy, Google, and Routing

I thought I would show you how I’ve been working with my students in my GIS Programming class, and showing them SQL, Python, and ArcGIS.  One of the fun things about Python packages is that there are tons of packages out there, and lots of things we can do with them – and it only takes a few lines of code.

One of the things my students do in Advanced GIS is create their own route-able networks in ArcGIS, and then we run a traveling salesman problem.  The scenario is that we take 20 banks in the local area and figure out what route we would take if we wanted to rob all the banks.

As you know, building routes in ArcGIS is complex – that isn’t ArcGIS’ fault – it’s just that there are lots and lots of things to think about.  But, it is important for the students to learn how to do this so that they can build their own if they have to.

Well, in my GIS Programming class, I showed them how to recreate the process with Google Maps.  The code window is shown below, and comments will follow:

[Image: routing code window]
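For anyone who wants to play with the bank-route scenario without ArcGIS or the Google API, here is a minimal sketch of a greedy nearest-neighbor tour.  To be clear about the assumptions: this is not the Google Maps code from class, the bank coordinates are invented planar coordinates, and nearest-neighbor is only a quick heuristic that will not always find the shortest possible route.

```python
import math

def dist(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_neighbor_route(stops, start=0):
    """Greedy tour: from each stop, go to the closest stop not yet visited."""
    unvisited = set(range(len(stops)))
    unvisited.remove(start)
    route = [start]
    while unvisited:
        here = stops[route[-1]]
        # Ties are broken by index so the result is repeatable
        nxt = min(unvisited, key=lambda i: (dist(here, stops[i]), i))
        unvisited.remove(nxt)
        route.append(nxt)
    return route

def route_length(stops, route):
    """Total length of the route, visiting the stops in the given order."""
    return sum(dist(stops[a], stops[b]) for a, b in zip(route, route[1:]))

# Hypothetical bank locations in projected units (e.g., kilometers)
banks = [(0, 0), (5, 0), (1, 1), (6, 1), (2, 3)]
route = nearest_neighbor_route(banks)
print(route, round(route_length(banks, route), 2))
```

Swapping dist for a function that looks up drive times would turn this into a road-network version of the same idea.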

Working with Manifold Forms

I love using forms in Manifold GIS to rapidly create applications for users.  Sure, you can create world class forms and applications with Visual Studio, but in literally minutes, you can create a functioning GIS application that you can start using immediately, or hand off to a friend.  Click on the image below, and I’ll show you how to do it:

[Image: formintro]

Want to learn how to program Manifold GIS with VBScript?  Check out Introduction to Manifold GIS Scripting, or any of the other 8 courses I now offer online at www.gisadvisor.com

PostGIS Commands in Arcpy

Most of you know, I love spatial SQL.  I also love PostGIS and ArcGIS – can’t we just all get along?  Well, yes we can.  In this case, Python becomes the great mediator.

Check out this video where I analyze data from an ESRI shapefile using Arcpy, call PostGIS to run a procedure on the data using the psycopg2 package, and return the results.

In this case, I am trying to compute the distance between points and lines in meters, but the data is in Lat/Lon.  So, I am bringing PostGIS into the mix because it does a very good job of returning distances between points and lines.

Now, you can do all of this in Arcpy directly by getting the geometry object:

import arcpy

# NAD83 / UTM zone 18N (EPSG:26918), so distances come back in meters
utm18n = arcpy.SpatialReference(26918)

# Select the points with a population above 10
pts = arcpy.MakeFeatureLayer_management("c:/temp/bkpts.shp")
selpts = arcpy.SelectLayerByAttribute_management(pts, "NEW_SELECTION", 'POP100 > 10')

rivers = arcpy.MakeFeatureLayer_management("c:/temp/rivers.shp")

# Project each point and each river to UTM, then print the distance between them
for pt in arcpy.da.SearchCursor(selpts, ["SHAPE@"]):
    ptutm = pt[0].projectAs(utm18n)
    for riv in arcpy.da.SearchCursor(rivers, ["SHAPE@"]):
        rivutm = riv[0].projectAs(utm18n)
        dist = rivutm.distanceTo(ptutm)
        print(dist)

But my point here was to show that you can move along with an Arcpy script, take a short and painless detour to use PostGIS, and then get back to your Arcpy script (assuming that PostGIS does something that you are really interested in using).
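If you can’t watch the video, here is a rough sketch of what the PostGIS detour might look like with psycopg2.  The assumptions are mine: I am supposing the points and rivers have already been loaded into PostGIS tables named bkpts and rivers, each with a gid key and a geom column in EPSG:4326, and the function name and connection string are hypothetical.  Casting to geography is what makes ST_Distance return meters for Lat/Lon data.

```python
# Distance in meters between every selected point and every river.
# Table and column names (bkpts, rivers, gid, geom, pop100) are assumptions.
DISTANCE_SQL = """
SELECT p.gid AS pt_id,
       r.gid AS river_id,
       ST_Distance(p.geom::geography, r.geom::geography) AS dist_m
FROM bkpts AS p
CROSS JOIN rivers AS r
WHERE p.pop100 > %s
ORDER BY p.gid, r.gid;
"""

def point_river_distances(conn_string, min_pop=10):
    """Run the distance query in PostGIS and return the rows as a list."""
    import psycopg2  # imported here so the sketch loads without the package
    conn = psycopg2.connect(conn_string)
    try:
        cur = conn.cursor()
        # The population threshold is passed as a query parameter
        cur.execute(DISTANCE_SQL, (min_pop,))
        return cur.fetchall()
    finally:
        conn.close()

# Example call (hypothetical connection string):
# rows = point_river_distances("dbname=gis user=postgres")
# for pt_id, river_id, dist_m in rows:
#     print(pt_id, river_id, dist_m)
```

Inside an Arcpy script, the returned rows are ordinary Python tuples, so you can carry on with them however you like.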

Want to learn more about how to program with Python, Arcpy, and PostGIS?  Check out my courses on gisadvisor.com.

Working with multiple Python versions

This is something I’m sure many of you already know, but it was new to me, so I figured that I’d share it here.  My student, Carl Flint, showed this to me last night.

Like many of you, I do Python programming in both ArcGIS and other open source platforms. This can sometimes be a real trick, as the ArcGIS Python is located in C:\Python27\ArcGIS10.3\……, while other Python packages are located in C:\Python27.

Yesterday in class, I wanted my students to take data from a .csv file, use geocoder to geocode locations, push the locations into PostGIS, and run a spatial query, and then run pygal to create a pie chart of the results – sort of a mini-geodashboard in the making.  That was really fun for the students to see how Python can get all these products interacting with one another.

But, my GIS Programming course also includes lessons on Arcpy, so I wanted to have them do the same thing except to use ArcGIS to perform the spatial analysis task.  The problem was, the Python libraries in Arcpy are located in a different spot than the regular Python 2.7.  And,  I was having difficulty installing pip in the ArcGIS directory.

So, Carl showed me how to use my Python27 directory to load the Python packages into the ArcGIS directory.  Under normal circumstances, you simply issue (replace psycopg2 with your favorite Python package like geocoder, pygal, etc.):

python -m pip install psycopg2 --target=C:\Python27\ArcGIS10.3\Lib\site-packages

Now, in our case there were other dependency problems, so I also had to issue the following two commands:

python -m pip install distribute --target=C:\Python27\ArcGIS10.3\Lib\site-packages
python -m pip install setuptools --target=C:\Python27\ArcGIS10.3\Lib\site-packages

That was it.  Now, I can use Arcpy to interact with all these really cool Python packages.
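One quick way to confirm the packages landed in the right place is to run a small check with the ArcGIS Python interpreter.  This is just a sketch of my own (the check_packages name is hypothetical); it only tries to import each package and reports which interpreter is doing the importing.

```python
import importlib
import sys

def check_packages(names):
    """Return {package_name: True/False} for whether each package imports."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results

# Shows which Python is running, e.g. the ArcGIS one or the plain Python27 one
print(sys.executable)

for name, ok in sorted(check_packages(["psycopg2", "geocoder", "pygal"]).items()):
    print("%-10s %s" % (name, "OK" if ok else "missing"))
```

If a package shows up as missing under the ArcGIS interpreter but OK under the regular one, the --target path is the first thing to double-check.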

As I said, I know a lot of you may already know this, but I hope it helps others.

Want to learn how to program using Python for Geospatial?  Check out my Python for Geospatial course,  or any of the 9 other geospatial courses I offer on Udemy here.  The links will get you any of the courses for $30 or less.

A Typical Class Project at Salisbury University: Evaluating Geocoding Accuracies

I’ve always been proud of our Salisbury University GIS students, and love to push them as far as their little minds can handle it.  You may recall that last Spring I had my Advanced GIS students perform independent GIS projects and present those projects as posters at an Undergraduate Research Conference.  Well, this Fall I am teaching GIS Programming, and have 7 awesome students (pictures and bios to follow).  We started the year off learning spatial SQL with Postgres and PostGIS.  We have now moved into Python, which includes Arcpy as well as other Python packages.

The semester was going so well, and the students were so responsive to anything I asked, I said what the heck, let’s try something crazy.  So, I showed the students how to use two Python geocoding packages (geocoder and censusgeocode) and then said:

Why don’t we conduct a research project over the next week to test the match rates and positional accuracies of the Google API and the United States Census Bureau API?

So yeah, I gave them a week to put this together: design, code, analyze, and write.  And, like most of my students at this level, they didn’t disappoint me.  This meant they had to integrate a lot of what they have learned over the years (programming, GIS, statistics, etc.).

I just uploaded their work onto researchgate:

 Click for ResearchGate Article  

I was surprised by how little there is out there in terms of quantitative assessment of geocoding accuracies.  I hope you have a chance to click on the link and check out the working paper (we will submit it to a journal sometime soon).  Also, I included a short abstract below so that you can see the results of our work (note: our paper includes the original data and the source code for performing the geocoding):

Undergraduate Research in Action: Evaluating the positional differences between the Google Maps and the United States Census Bureau geocoding APIs

Abstract:  As part of a class assignment in GIS Programming at Salisbury University, students evaluated 106 geographically known addresses to determine the match rate and positional accuracy obtained using the Google and the United States Census Bureau geocoding application programming interfaces (APIs). The results showed that 96.2% of the addresses supplied to the Google API were successfully geocoded, while 84% of the addresses supplied to the Census Bureau API were successfully geocoded.  Further, the Google API matched 90% of the addresses with a ROOFTOP designation.  The average positional accuracy of the Google-derived addresses was 80m overall, and 65m for those geocoded with the ROOFTOP designation, while the Census Bureau positional accuracy was 271.09m.

So yeah, this is what you can do with 7 GIS undergraduates at Salisbury University: they work hard, fast, and are a very creative bunch.
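For a sense of what the two measures involve, here is a minimal sketch of computing a match rate and an average positional error.  It is not the students’ code (that is in the paper), the addresses and coordinates below are invented, and I am assuming great-circle distance as the error measure.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def summarize(known, geocoded):
    """Return (match rate in percent, mean positional error in meters).

    known:    {address: (lat, lon)} surveyed locations
    geocoded: {address: (lat, lon)} or None where the geocoder found no match
    """
    errors = []
    for address, truth in known.items():
        hit = geocoded.get(address)
        if hit is not None:
            errors.append(haversine_m(truth[0], truth[1], hit[0], hit[1]))
    match_rate = 100.0 * len(errors) / len(known)
    mean_error = sum(errors) / len(errors) if errors else None
    return match_rate, mean_error

# Invented test data, not the addresses used in the paper
known = {
    "1 First St": (38.3600, -75.6000),
    "2 Second St": (38.3650, -75.6050),
    "3 Third St": (38.3700, -75.6100),
    "4 Fourth St": (38.3750, -75.6150),
}
geocoded = {
    "1 First St": (38.3600, -75.6000),
    "2 Second St": (38.3655, -75.6050),
    "3 Third St": (38.3700, -75.6110),
    "4 Fourth St": None,
}

rate, error = summarize(known, geocoded)
print("match rate: %.1f%%, mean error: %.1f m" % (rate, error))
```

With a real geocoding package, the geocoded dictionary would be filled from the API responses instead of typed in by hand.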

 

 

A Detailed Review of My Open Source GIS Courses

This review was written by a former student, Brady Woods.  Brady is a GIS Analyst, and after taking my Open Source GIS course, he immediately implemented an Enterprise GIS at his work where he manages 15 GIS technicians.  I was so happy to read his review, but even happier in knowing that a course I created allowed a student to immediately apply the skills in his work environment.  I hope you enjoy reading Brady’s review. – ajl

My name is Brady Woods, and I am a graduate student enrolled in the Master of Professional Studies: Geospatial Information Science (MPS: GIS) program at the University of Maryland, College Park. I also work as a GIS Analyst at a research institution within the University of Maryland. This past summer the geography department at UMD offered a new selected topics in geography course titled ‘Open Source GIS’ taught by Dr. Lembo. I found the course to be one of the best of my academic career, and would like to provide a review of the course for those who are considering taking Dr. Lembo’s courses either through traditional academic means or through online learning platforms such as Udemy.

The course began with an introduction to open source GIS via QGIS. As Master’s students, most of us were well versed in spatial analysis methods and their applications, especially within the ArcGIS framework. The QGIS assignments revealed the parallels between the two software packages and demonstrated that what we needed to accomplish in ArcGIS could also be done in QGIS – all for the low cost of free. We also touched on the QGIS print composer, a complex suite of cartographic and visualization tools that rivaled the cartography tools available in the ArcGIS Desktop suite.

The remainder of the course was comprised of Dr. Lembo’s Udemy courses, including: Spatial SQL with Postgres: A language for Geographers, Using Open Source Tools to Create an Enterprise GIS, and Internet Mapping with GeoServer, Postgres, and OpenLayers. These courses assumed limited prior knowledge of their subject matter, walking the student through each step and explaining in detail what each step accomplishes. Below are my thoughts on each course:

  • Spatial SQL with Postgres: The objective of this course was for the student to learn spatial SQL through the use of PostgreSQL (FOSS RDBMS) and PostGIS (spatial extension). Throughout the course you will write SQL code alongside Dr. Lembo, working through the basics of SQL all the way to complex spatial analysis methods such as spatial joins. After this course I was able to implement this in my daily workflows, where I now maintain Postgres/PostGIS databases with tables containing millions of records. Spatial operations that would take hours in other GIS products can be accomplished in minutes using PostGIS.
  • Using Open Source Tools to Create an Enterprise GIS: The objective of this course was for the student to build on the QGIS/Postgres/PostGIS already learned in order to construct a multi-user enterprise GIS. Throughout this course you will follow Dr. Lembo as you upload data to a PostGIS database, create and manage multiple users and login roles, and finally test your enterprise system in a multi-user QGIS editing environment. After this course I was able to implement my own enterprise GIS in my workplace with ~15 users performing various tasks.
  • Internet Mapping with GeoServer, Postgres, and OpenLayers: The final piece of the puzzle. At this point, we now know how to perform desktop GIS analysis and enterprise GIS tasks using FOSS GIS software. Throughout this course you learn how to build an internet map server that serves the data you created in the Enterprise GIS course by using GeoServer to serve the layers, and OpenLayers to display them. After this course I was able to provide my workplace with more dynamic cartographic products via basic web map applications compared to static maps.

A final note – although I took these Udemy courses as a geography Master’s student, they are quite accessible and rely on minimal background knowledge to complete. Anyone with an interest in technology and some patience for basic troubleshooting will be able to succeed.  However, if you have an opportunity to take an actual University course from Dr. Lembo, I would highly recommend it, as he adds so much more to the course beyond what you’ll get in the Udemy courses.  Also, Dr. Lembo often spent hours with us before and after the class just shooting the breeze about GIS – it made for a great learning experience.

Scotland GIS tour – please join me!

I’m getting very excited about my upcoming trip to Scotland.  I’ve got 3 talks scheduled:

  1.  A seminar on spatial SQL at Stirling University (October 31)
  2. A lightning talk at the 6th Annual QGIS User Group Meeting (November 3)
  3. A seminar on parallel processing at Edinburgh University (November 4)

The Seminar at Stirling University is a talk I’ve given in the past related to spatial SQL being a language for geographers.  However, it will be focused more on biology and natural resources as it is more fitting to the audience.

I’m really excited to attend the QGIS User Group Meeting – looking over the topics, I think this is going to be one of those conferences that I get more than I give.   My talk will be more technical and get into some of the code we developed to create a QGIS Plug-in using GPU technology for terrain analysis.  We made some great progress, but we also discovered we have a lot more to learn :-)

The talk at Edinburgh University should be a fun one.  I’ve titled it Exploring the potential of using your teenager’s gaming computer as a high performance computing (HPC) GIS workstation, and it will cover not only the GPU work I’ve been involved with, but also multiprocessor work I’ve done with Spatial Hadoop and the Python multiprocessing module.  The seminar is free and open to the public, and I understand there will be drinks afterwards.

I’ve also scheduled some time with a couple of Professors at St. Andrews University on November 1.

If any of my GIS friends are near the UK during that time, it would be great to get together.  I think I’m also going to take advantage of my down time (although if I agree to any more talks, there won’t be much down time left!), and do a lot of sight-seeing in these beautiful cities.

Finally, I don’t know if the new version of Manifold will be out by then, but if it is, I will try to squeeze in some time to talk with people about it.

Do video lectures help students – Yes!

Have you ever sat in a lecture and been totally confused?  I know I have.  I wished the Professor would either slow down, or that I had a time machine to revisit the lecture.  That made comprehension really difficult.

Recently, I’ve cut down on my peer-reviewed publications, and focused more on undergraduate teaching materials.  This included a very readable and applied textbook in statistics and geography for under $50, and a supplementary workbook for under $20.

Last year I decided to create online video lectures for my course Statistical Problem Solving in Geography  through Udemy.  These are abbreviated lectures for all of my book chapters, and they also include hands-on demonstrations of certain calculations.  I offer these lectures free of charge to my students at Salisbury University so that they can supplement the class lectures (I warn them that they are not an excuse for skipping lecture!).  You can see examples of the lectures here.

Over the last 6 years, the first exam in my course has been a bit of an eye opener for students.  The class average has ranged anywhere from 62% to 71%.  For the last two semesters, I have provided my students with the supplementary lectures, and the averages for the first exam were 82% and 84%!  That is a huge improvement.

Also, when giving my final practicum, I found that these sophomore-level undergraduates were able to perform multivariate regression analysis: they not only knew how to eliminate explanatory variables that were not statistically significant, they were also able to identify multicollinearity and, using the sequential sums of squares, determine which explanatory variable should be tossed out – to be honest, I couldn’t even do that in graduate school!!

So yes, this is a great way to supplement student learning, and I really believe that understanding statistical principles is critical for any geographer wanting to perform quantitative analysis or even GIS.

* By the way, if you are interested in learning the material covered in my college level quantitative geography course, you can take Statistical Problem Solving in Geography for the same $25 price I offer it to college students outside of my University by clicking here (the course is normally $75, and I often provide coupons for $30).   You can check out discounts on my other online geospatial courses here.

Breaking a vector file up based on a unique attribute

It’s been a while since I’ve done any Manifold stuff, so I thought I would show you how to break up a vector file based on a unique attribute.  A question was posted on the Manifold GIS forum asking how to create new drawings (for those of you in an ESRI world – think shapefiles) based on the unique attributes in the database.

This is pretty easy to do, and I think might work really nicely within a Python paradigm (especially if you use psycopg2 that allows you to connect to Postgres).

 

Sub Main
  Set roads = document.ComponentSet("D")
  Set qry = document.NewQuery("Q",true)
  Set qry2 = document.NewQuery("Q2",true)
  qry.text = "select tp from D group by tp"
  for each rec in qry.Table.RecordSet
    thetype = rec.Data("tp")
    Set newdrw = Document.NewDrawing(thetype,roads.CoordinateSystem,true)
    qry2.text = "update D SET [Selection (I)] = true WHERE tp = " & chr(34) & thetype & chr(34)
    qry2.runEx true
    roads.copy(true)
    newdrw.paste
    roads.selectnone
  Next
End Sub

So what’s happening here?  Lines 1 and 15 just start and stop the subroutine.

Lines 2 – 4 set up three objects: a drawing called “D” that we set as a variable called roads; a query component called “Q”, and another query component called “Q2”.  In Manifold, a query component allows you to write SQL queries.

Line 5 – This is a basic SQL query where we are selecting a field called “tp” that has names of the road types (e.g., Primary, Secondary, etc.).  The GROUP BY statement basically just returns the unique road type names.  So, the result would be a table with the unique road types.

Line 6 – This is the start of a for loop.  When we say qry.Table.RecordSet, what we are actually doing is retrieving an object that is the result of the query we issued.  So, we are going to loop over that list of names.  The object we return is called rec and that represents the individual record in the list.

Line 7 – now that we are in the loop, the first thing we do is grab the first unique road type by getting the data from the rec object.  So, for that record, we grab the data value for the “tp” field.

Line 8 – This line creates a new drawing, using the name we got from the record in line 7, and we assign it the coordinate system for the original file we are working on.

Lines 9 and 10 – These lines create a query that selects all the records in the original drawing that have the unique name in the “tp” field, and then run the query.  The result is an updated table where the selected records are highlighted.

Lines 11 – 13 – Here we do a little cut-and-paste gymnastics.  Line 11 copies the selected features in the original drawing (remember, these are just the features that have the “tp” name we are interested in for this part of the for loop).  Line 12 pastes those features into the new drawing we created in line 8, and line 13 unselects the lines.

Line 14 – When we get here, we’ve just created our first new drawing with the features that match the first road type.  The Next directive sends us back to the beginning of the for loop and we then process the next road type name.

That’s it.  It is really easy with Manifold scripting.  I’d post the Python solution, but I’m teaching GIS Programming this semester, and I think I’ll leave that to my students to do!
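Without giving away the Arcpy or psycopg2 version, the core idea (one output per unique attribute value) can be sketched in plain Python.  This is only an illustration under my own assumptions: the features are GeoJSON-style dictionaries that I made up, and split_by_attribute is a hypothetical helper, not part of any GIS package.

```python
from collections import defaultdict

def split_by_attribute(features, field):
    """Group features into one list per unique value of the given field."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature["properties"][field]].append(feature)
    return dict(groups)

# Made-up road features in a GeoJSON-like layout
roads = [
    {"properties": {"tp": "Primary"},
     "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]}},
    {"properties": {"tp": "Secondary"},
     "geometry": {"type": "LineString", "coordinates": [[1, 1], [2, 1]]}},
    {"properties": {"tp": "Primary"},
     "geometry": {"type": "LineString", "coordinates": [[2, 1], [3, 3]]}},
]

by_type = split_by_attribute(roads, "tp")
for road_type in sorted(by_type):
    # Each group plays the role of one new drawing in the Manifold script
    print(road_type, len(by_type[road_type]))
```

Writing each group out to its own file or table is the part that depends on which GIS package you are using.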

If you are interested in learning how to program with Manifold, Postgres/PostGIS, or build an enterprise GIS, check out my online courses here.