When charts bore you, create a map!
As I continue my sabbatical, one of the projects I am working on is earning a certification in Geographic Information Systems (GIS). Why? For much of my career I have been creating charts… charts showing revenue growth over time, charts showing sales rep performance, and charts showing the health of a SaaS startup. After 15 years of charts, I felt it was time to explore another form of data visualization.
I enrolled in the GIS Specialization offered on Coursera and created by UC Davis. I am all too familiar with data transformation, data management, and blending of data, so I was really curious how different a GIS would be compared to the likes of Salesforce.com or Tableau (which does offer mapping).
The Fundamentals of GIS course itself is much more about learning to use ArcGIS and ArcMap. We did learn about projections, GIS best practices, what spatial analysis really is and how to open the ArcMap software. Aside from learning the tool and file types, there really wasn’t much different from what I already knew as a data and insights analyst.
Take the final peer-graded assignment for Fundamentals of GIS as an example. The course provides data including a counties data file defining counties in California. The course also provides a second data file including voting precincts and the voting results for Prop 37. The goal is to combine the two data files and create a normalized map showing the ratio of Yes votes to total votes. Seems simple enough?
It was fairly simple. As with any data-related project, the first thing you do is download and validate the data. Can you open the zip files? Is the data there in its entirety? Once you know the data is usable, get to know the data. Look at the metadata to see what fields are included and what they mean. Since we have two files which need to be combined, we need to find a common key to join them.
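The "can you open the zip files?" check can be scripted. Here is a minimal sketch using Python's standard `zipfile` module. The function name `check_archive` and the member names in the demo are hypothetical stand-ins, not the actual course files.

```python
import io
import zipfile


def check_archive(source, expected=()):
    """Open a zip archive, verify each member's checksum, and list its contents.

    `source` is a path or file-like object. `expected` is an optional
    collection of member names that must be present (hypothetical names here).
    """
    with zipfile.ZipFile(source) as archive:  # raises BadZipFile if unreadable
        corrupt = archive.testzip()  # first member failing its CRC check, else None
        if corrupt is not None:
            raise ValueError(f"corrupt member: {corrupt}")
        names = archive.namelist()
    missing = sorted(set(expected) - set(names))
    if missing:
        raise ValueError(f"missing members: {missing}")
    return names


# Demo with an in-memory archive standing in for the course download.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("counties.shp", b"placeholder")
    archive.writestr("counties.dbf", b"placeholder")
demo.seek(0)
print(check_archive(demo, expected=["counties.shp", "counties.dbf"]))
```

This only confirms the archive is intact and complete. Reading the metadata and field definitions is still a manual step.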
While it took me only a few minutes to review the data, it took a bit longer to understand the connection between the two data sets. Each county contains many precincts, so it was clear that we needed either a one-to-many join or a spatial join. There are a few different ways to do this. I first decided to summarize the precinct data and output a table showing the total votes per county. I could then join this table to the Counties data as a one-to-one join.
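Outside of ArcMap, the summarize-then-join idea looks roughly like this. It is a sketch in plain Python. The county IDs, names, and vote counts are made up for illustration, and the real course data has different fields.

```python
from collections import defaultdict

# Hypothetical precinct rows: (county_id, yes_votes, no_votes)
precincts = [
    (1, 120, 80),
    (1, 60, 40),
    (2, 30, 70),
]

# Hypothetical county table keyed by the same county_id
counties = {1: "Alpha County", 2: "Beta County"}


def summarize_by_county(rows):
    """Collapse many precinct rows into one total row per county."""
    totals = defaultdict(lambda: {"yes": 0, "total": 0})
    for county_id, yes, no in rows:
        totals[county_id]["yes"] += yes
        totals[county_id]["total"] += yes + no
    return dict(totals)


def join_one_to_one(county_table, totals):
    """Attach the summary row to each county using the shared key."""
    empty = {"yes": 0, "total": 0}
    return {
        county_id: {"name": name, **totals.get(county_id, empty)}
        for county_id, name in county_table.items()
    }


joined = join_one_to_one(counties, summarize_by_county(precincts))
for county_id, row in joined.items():
    print(county_id, row)
```

Summarizing first turns the one-to-many relationship into a one-to-one join, so each county ends up with exactly one row of vote totals.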
Alternatively, you can join the two data sets using the Spatial Join tool. Instead of joining on a common key (I joined on county number), you join them based on their spatial relationship, such as intersects or contains.
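To give a feel for what a "contains" spatial join does, here is a simplified sketch in plain Python. It is not how ArcMap implements the Spatial Join tool. It assumes each precinct is reduced to a single representative point and each county is a simple polygon, and all coordinates and names are invented for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is the point inside the polygon (list of (x, y) vertices)?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons):
    """Match each point to the first polygon that contains it, or None."""
    result = {}
    for point_id, point in points.items():
        result[point_id] = next(
            (poly_id for poly_id, poly in polygons.items()
             if point_in_polygon(point, poly)),
            None,
        )
    return result


# Hypothetical counties (squares) and precinct points
county_shapes = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
precinct_points = {"p1": (2, 3), "p2": (12, 5), "p3": (25, 5)}

print(spatial_join(precinct_points, county_shapes))
```

No shared key is needed here. The match comes entirely from where the features sit on the map.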
Once the data is ready to display on the map, you can use the “Symbology” settings of the joined data layer to display a normalized ratio. Showing absolute numbers of Yes votes does not really tell the whole story, as some precincts and counties have greater populations. Normalize the Yes votes by calculating the ratio of Yes votes to total votes. This produces the map we were looking for. Once we add the required metadata, scale, etc., we can export it. (view my map online here)
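A quick sketch of why normalizing matters, using two made-up counties (these are not figures from the Prop 37 data):

```python
def yes_ratio(yes_votes, total_votes):
    """Share of Yes votes, or None when no votes were cast."""
    if total_votes == 0:
        return None
    return yes_votes / total_votes


# Hypothetical counties: name -> (yes_votes, total_votes)
votes = {
    "Big County": (60000, 100000),
    "Small County": (900, 1000),
}

ratios = {name: yes_ratio(yes, total) for name, (yes, total) in votes.items()}

# Ranking by raw Yes votes and by ratio gives different leaders.
top_by_count = max(votes, key=lambda name: votes[name][0])
top_by_ratio = max(ratios, key=lambda name: ratios[name])
print(top_by_count, top_by_ratio)
```

On raw counts the larger county dominates, while the ratio shows the smaller county was far more in favor. That is the difference the normalized map is meant to bring out.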
What did I learn from taking this course? Spatial analysis is a specialized field which does not differ too much from more traditional data analytics. The course taught me the special file formats, terminology, and ArcGIS basics. What is most interesting to me is that this map could be made with other platforms like Tableau and Power BI. The only difference is that the data must be manipulated outside the software (in Excel, maybe) and then visualized.
This brings up a great point. In traditional analytics and business intelligence, you work with specialized tools which each handle a specific part of the data journey. From ETL (Talend or Kettle) to analysis (Excel or Python) to visualization (Tableau or Qlik), each segment required different software. Today, the lines are blurring a bit. Solutions like Alteryx combine ETL with analysis but leave a lot to be desired in terms of visualization. Tableau is also able to connect to and blend a variety of data sources, but leaves something to be desired in analysis.
After taking this course I am left with a profound sense of how specialized GIS is. I can understand why it is well worth investing in, especially for geospatial analysis consisting of multiple data layers. When you consider that ArcGIS (or GIS in general) is capable of global-level analysis, it takes your breath away.
My eyes are open to how I can leverage GIS and merge it with my interest in History. Perhaps creating a historic spatial database which illustrates the speed at which Manifest Destiny occurred? Maybe we can start with a map of Texas and how it was settled over time? Stay tuned…