How do we know if our law libraries, in the early 21st century, are best positioned to serve our modern clientele? When you consider that many law libraries found their homes in courthouses in the early 20th century, and the changes in the profession since then, it makes you wonder if we have the right locations.
Location, Location, oh Hell with it
If we were fast-food restaurants, we would know. In fact, not only would we know whether our current location was right, we would move if it wasn’t. But how do largely non-profit or government law libraries make this type of decision? Is there data we can use?
I think there is and I found or created a number of Microsoft Excel spreadsheets that included data based on zip codes (post codes). My goal was to create a map that (a) showed the density of lawyers overlayed with (b) locations of law libraries. Excel has a Power Map tool available for download but I was not able to get what I wanted from it.
Here’s what I tried next.
Density Map Tools
The goal was to create a heat map. But using a real map, not just a colored table. The term of art is choropleth all though it took me an age to figure that out. I’m not sure in the end a choropleth is exactly what I was looking for but it certainly helped to hone my searching.
OpenHeatMap is a free online tool that I really liked. The developer has pretty clear documentation, you label your columns in Excel to match the specific information his tool accepts, and you upload your file. You can then see a Flash-based map based on OpenStreetMap data. Unfortunately, while the tool worked, my file was so large that it kept crashing my Flash plugin and so was unwieldy.
Google Fusion Tables
I tried Google Fusion Tables next. This is a beta project from Google and is free. It has one significant upside, which is that you can geo-reference your data file using Google Maps API. If you only have a list of postal or zip codes, this can be a huge time saver. It will look at your zip code, and return a latitude and longitude for your location.
There is a cap to how many references Google Maps API will geo-reference, which seems to be 2500 in 24 hours unless you have a premium account. Using Google Fusion Tables seems to bypass this, as I was able to get 10,000 rows geo-coded without having any type of special account.
Unfortunately, I didn’t get the results I expected. I take complete responsibility for this myself but all of my 20,000+ locations ended up in one of two places in the Atlantic ocean. When I ran the data through a second time, I ended up with better plotting but still in the wrong place.
I decided I needed a tool that was clearer to me. I moved on to Tableau.
Intermission: I Need Latitude and Longitude
Google’s Map API provides this information within Google Fusion Tables but I was working with solely post codes when I started over. The challenge was how to get the latitude and longitude for a long list of zip codes. The answer was surprisingly simple.
First, I grabbed the Canadian post codes data from Geocoder.ca. This data file is a list of post codes, latitude, longitude, municipality, and province. I opened it up in Microsoft Excel and cut out all of the non-Ontario data. Easy enough. Now to filter out the postal codes that did not occur in my OWN list.
This means that if postal code K0A 1A0 (cell F2) occurs in any cell in column A (that’s the A:A in the MATCH statement), insert the Latitude (from B:B) for that row. You can continue it across for the other cells, changing B:B to C:C for Longitude (use =INDEX(C:C,MATCH(F2,A:A,0)) ), and D:D (Municipality). The MATCH statement stays the same, because you’re testing to see if the post code in column F exists in column A.
Once you’ve set up the formulas in the first row, you can drag all three down your entire worksheet. Use the fill handle and it will update each row properly. You will get a load of N/A errors if your post code isn’t a match. This helped me since I didn’t have an Ontario-limited list, and it got rid of post codes that might have been from different provinces or countries.
A tip: WAIT. Once you have dragged your fill handle down over your sheet, in my case over 20,000 rows, Excel gets to work. It can take a long time (10s of minutes) for it to complete its work. Leave it alone. I learned this when I saw one city name populated for over half of my rows, clearly a mistake. But Excel just hadn’t powered through all the rows yet. When I gave it time, it completed successfully. I could then cut and paste my columns (F, G, H, I, J) into a new worksheet to use with the mapping tools.
Tableau Public (Community)
Tableau was one of a number of programs that I came across with fee-based applications. At this point, I was still looking for a free option for what was likely to be a one-off project. Tableau has a free desktop application called Tableau Public and a community to go along with it. This sold me, particularly when I got into the tool and saw how powerful it could be.
The first challenge I had was that it didn’t recognize any of my locations. The zip code/postal code function defaults to the US. Once I flipped this dropdown to Canada, it started to map properly.
But it still didn’t get all of my locations. In fact, I went from 22,000 errors to more than 6,000 errors. In this case, it could not find the municipalities that I was referencing. But I know that these exist. As you can see below, Alliston is a town that is near Barrie but there is no corresponding match in the pre-populated list from Tableau.
The mapping in Tableau is great and, I think, it is possibly the best tool that I used while trying to create this heat map. In the end, though, I was only able to map most of my points and not all of them.
CartoDB is a web-based mapping tool. It is the easiest tool that I found although the free version requires you to make your maps publicly accessible. It’s a pretty steep monthly subscription to be able to lock them down. So while I ended up using it to create my map, the pricing for my one-off project means I’ll be heading back to Tableau to spend more time understanding whether I can go around its limitations.
There really isn’t much to say about CartoDB, which says enough. The interface is easy to use and I quickly uploaded my initial dataset (which would be the density file) plus two additional files of geo-located sites: law libraries and courthouses. It was a matter of minutes to figure out how to overlay these three files and to give them different looks. The buildings received a single icon or dot, while the density map had a color range.
Using maps can make data more interesting where it relates to a place. It takes the numbers out of a spreadsheet column and gives you some perspective. CartoDB’s ability to easily layer this data means you can contrast a variety of datasets.
Now that I have created the visual, I am not sure what I will do with it next. The project was initiated based on a question, not only of what the density and proximity was, but whether I could do it. I need to now spend some time looking at the visualization of the datasets – which are already generating more questions – and see if I can come up with any possible answers.