From around the middle of January, I’d been trying to figure out how to make interactive maps. Last month, I put up my first finished map (I say “my” – I got a lot of help along the way, and those people, who I owe a lot of thanks to, will be acknowledged below). So it looks like it took about two months to figure out how to do it. It felt like rather more. What I’m going to do here is explain a little about what I did, why it didn’t work, what I then did, why that was OK but still needed work, then what I ended up doing. It’s likely to be a long post.
Background – the kind of maps I wanted to draw
What I was keen on doing was coming up with an interactive map where you would colour in areas according to certain population-level data – e.g. unemployment rates for an area, maybe GDP for a country, that kind of thing. I wasn’t looking to draw GIS maps, maps that would help people navigate between places. My interest was more in using the map as a picture that could tell us about how the characteristics of areas differed.
I also wanted something that looked a bit different from most similar maps online. I don’t much like, for instance, the Google Fusion approach, where colours are overlaid on an existing street map. Visually I think it’s not great – it’s a bit cluttered. (Also there’s something about them that is almost misleading – including streets, and even, on a close zoom, buildings in the picture could make the viewer think that the data applies at the level of the street, or even the building, when actually it’s an average of a larger area.) Really what I was looking to do was create something quite clean looking that I could give an identity to.
Like almost all my D3 expeditions, this one begins with Scott Murray’s book, Interactive Data Visualization for the Web. There’s a chapter in there about drawing a map of the United States, linking it to some data and then colouring it in nicely.
The key to drawing the map of the United States is getting a file of the map coordinates in JSON format. You link that to the D3 code which does the work of turning these coordinates into lines and borders that form shapes that you can then colour in as you see fit. So what I needed was a JSON file for the UK, showing the boundaries of UK local authorities.
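To make the "coordinates into shapes" step concrete, here is a minimal sketch in plain JavaScript, without D3. The area name, the square boundary and the `toPixels` and `polygonToPath` helpers are all made up for illustration; D3's own path generator does the same job far more thoroughly.

```javascript
// A tiny GeoJSON FeatureCollection with one made-up square "authority".
const geojson = {
  type: "FeatureCollection",
  features: [{
    type: "Feature",
    properties: { name: "Exampleshire" }, // hypothetical name
    geometry: {
      type: "Polygon",
      // One ring of [longitude, latitude] pairs; the first point is repeated at the end.
      coordinates: [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
    }
  }]
};

// Hypothetical stand-in for a projection: scale degrees to pixels and flip y,
// because screen y grows downwards while latitude grows upwards.
function toPixels([lon, lat], scale = 100, height = 100) {
  return [lon * scale, height - lat * scale];
}

// Turn one polygon's rings into an SVG path "d" string.
function polygonToPath(rings) {
  return rings
    .map(ring => "M" + ring.map(p => toPixels(p).join(",")).join("L") + "Z")
    .join("");
}

const paths = geojson.features.map(f => polygonToPath(f.geometry.coordinates));
console.log(paths[0]);
```

Each string in `paths` could be used as the `d` attribute of an SVG `path` element, which is then filled with whatever colour the data calls for.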
The first place I went was the Ordnance Survey, which gives away loads of boundary data, mainly as shapefiles. Shapefiles (.shp files) are what MapInfo and ArcGIS use. They make no sense without the software to read them, and I don’t have the software.
What I had to do was find something to turn these files into JSON files. Scott Murray recommended mapshaper.org, which allows you to drop files in and convert them to other file types.
That’s not the half of it, though. The main thing mapshaper does is allow you to smooth out wiggly lines, of which the UK has loads, both as internal borders between areas and, more obviously, in the form of coastline. Doing this makes the file size much smaller, and so the map loads much quicker, which is obviously a good thing.
Anyway, the net result of all this is that mapshaper.org spits out a JSON file which you can point at the code which will just draw you the map you need. Pretty much. Here is my first attempt. I have coloured the authorities in red, as you can see.
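As a rough illustration of what smoothing out wiggly lines involves, here is a sketch of the Douglas-Peucker algorithm, one common line-simplification method. It is not necessarily the method mapshaper uses, and the `coast` points and the tolerance are invented.

```javascript
// Perpendicular distance from point p to the line through a and b.
function perpendicularDistance(p, a, b) {
  const dx = b[0] - a[0], dy = b[1] - a[1];
  const length = Math.hypot(dx, dy);
  if (length === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  return Math.abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / length;
}

// Keep the end points. Keep the farthest intermediate point only if it lies
// more than `tolerance` from the straight line, then recurse on each half.
function simplify(points, tolerance) {
  if (points.length < 3) return points.slice();
  const first = points[0], last = points[points.length - 1];
  let maxDist = 0, index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = perpendicularDistance(points[i], first, last);
    if (d > maxDist) { maxDist = d; index = i; }
  }
  if (maxDist <= tolerance) return [first, last];
  const left = simplify(points.slice(0, index + 1), tolerance);
  const right = simplify(points.slice(index), tolerance);
  // The split point ends `left` and starts `right`, so drop one copy.
  return left.slice(0, -1).concat(right);
}

// Made-up wiggly line, for illustration only.
const coast = [[0, 0], [1, 0.1], [2, -0.1], [3, 5], [4, 6], [5, 7], [6, 8.1], [7, 9], [8, 9], [9, 9]];
const simplified = simplify(coast, 1.0);
console.log(coast.length, "points reduced to", simplified.length);
```

A larger tolerance throws away more points, which gives a smaller file but a blockier coastline.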
A map of English local authorities.
A second map of English local authorities
Clearly, this is a more detailed map. But it’s still not quite right.
What’s going wrong is the projection. Anytime you want to reproduce a 3d picture (which is what a map of the UK is) in 2d, you need to use a projection to map those coordinates. And if you get the wrong one, pretty much anything could happen.
The problem I was having was to do with the coordinate system the Ordnance Survey uses, which is specific to its own surveys, rather than a global system such as latitude and longitude. When I put those coordinates into my code template, they got read as the above. Which is obviously hopeless.
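To show what a projection actually does, here is a minimal sketch of a spherical Mercator projection. It assumes its input is longitude and latitude in degrees, which is exactly the assumption that national grid coordinates break. The `mercator` function and its `scale`, `centre` and `size` options are illustrative names, not D3's API, and the city coordinates are approximate.

```javascript
// Spherical Mercator: [longitude, latitude] in degrees -> [x, y] in pixels.
function mercator([lon, lat], { scale, centre, size }) {
  const toRad = Math.PI / 180;
  const x = lon * toRad;
  const y = Math.log(Math.tan(Math.PI / 4 + (lat * toRad) / 2));
  const cx = centre[0] * toRad;
  const cy = Math.log(Math.tan(Math.PI / 4 + (centre[1] * toRad) / 2));
  // Screen y grows downwards, so the vertical offset is subtracted.
  return [size[0] / 2 + scale * (x - cx), size[1] / 2 - scale * (y - cy)];
}

// Illustrative settings: centre the map roughly on Great Britain.
const options = { scale: 2000, centre: [-2, 54], size: [600, 800] };

const london = mercator([-0.1276, 51.5072], options);    // approximate position
const edinburgh = mercator([-3.1883, 55.9533], options); // approximate position
console.log(london, edinburgh);
```

Feeding this function grid references measured in metres instead of degrees would still produce numbers, just not ones that land anywhere sensible on the screen.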
A new approach
What I needed, then, were files that used the right kinds of coordinates and actually there are plenty out there. Every time someone makes a Google Fusion map, if the settings are friendly enough, you can download the file that draws the boundaries. These boundaries are KML files but the key thing is that they are drawn off global coordinates, which are the ones my D3 code knows how to deal with. So we’re getting somewhere here.
Simon Rogers, who used to run the Guardian datablog and now works at Twitter, draws lots of Google Fusion maps, and has a bunch of boundary files on his site, including some for UK local authorities. They’re free to download, so I took them.
All the data I need is in these files. The coordinates are global – you can see, for instance, in the borough of Greenwich that the longitude crosses 0 degrees. There’s a bit of messing about to do to get the right punctuation for a JSON file – JSON uses lots of [square brackets], KML files tend not to – but mostly it’s easy enough.
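The punctuation change can be sketched in a few lines. KML writes a boundary as whitespace-separated "longitude,latitude,altitude" tuples, while GeoJSON wants nested arrays. The coordinates and the area name below are made up, and `kmlCoordinatesToRing` is a hypothetical helper rather than anything from the original files.

```javascript
// Convert the text of a KML <coordinates> element into a GeoJSON-style ring.
function kmlCoordinatesToRing(text) {
  return text
    .trim()
    .split(/\s+/)                                             // one tuple per chunk
    .map(tuple => tuple.split(",").slice(0, 2).map(Number));  // drop the altitude
}

// Made-up coordinates straddling the Greenwich meridian, for illustration only.
const kml = `
  -0.02,51.47,0 0.01,51.47,0 0.01,51.49,0 -0.02,51.49,0 -0.02,51.47,0
`;

const feature = {
  type: "Feature",
  properties: { name: "Example area" }, // hypothetical name
  geometry: { type: "Polygon", coordinates: [kmlCoordinatesToRing(kml)] }
};
console.log(JSON.stringify(feature));
```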
But after getting one clean line of JSON for each local authority, I ran into a problem. I could see the different LAs on the screen, the boundaries looked fine, but when I tried to colour them in, it would colour in the whole page. I thought that maybe the boundaries weren’t closed, but they kind of are by definition – that’s how the things work, they’re just lists of coordinates and the code links them up.
So, as I always do, I went to Stack Overflow and posted my problem. And as always happens, someone helped me out. They pointed out that the problem was that the boundaries were effectively drawn inside out, so that the areas I was seeing were cut outs from the whole screen; the inverse of what I was looking for. So rather than colouring in the shape of the local authority, I was colouring everything but that shape and after I’d coloured in a couple, the whole screen was coloured in – local authorities, the North Sea, the Channel, Ireland, the lot.
The reason this had happened was that the coordinates were written anticlockwise, whereas the code wanted clockwise coordinates. My Stack Overflow helper sent me a link to a place that would tell me how to reverse them without tediously going through each one and rewriting them, which would have been impossible. It looked a little complicated, and it was already late in the day so I thanked my correspondent and went to the pub.
By the time I got back from the pub, the same person had responded again to my thank-you post. They said they hadn’t realized quite what they were linking to, and yes it did look rather complicated, and why don’t you try this piece of code I’ve just written especially that reverses all the coordinates for you, and prints the new ones you need underneath the map that demonstrates that this all now works?
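The idea behind reversing the coordinates can be sketched as follows. This is not AmeliaBR's code, just a minimal illustration that treats longitude and latitude as flat x and y, which is a simplification. `signedArea` and `ensureClockwise` are hypothetical helper names.

```javascript
// Shoelace formula: positive for anticlockwise rings, negative for clockwise,
// with longitude as x and latitude as y (y pointing up).
function signedArea(ring) {
  let sum = 0;
  for (let i = 0; i < ring.length - 1; i++) {
    const [x1, y1] = ring[i];
    const [x2, y2] = ring[i + 1];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

// Return a clockwise copy of the ring if it is wound anticlockwise.
function ensureClockwise(ring) {
  return signedArea(ring) > 0 ? ring.slice().reverse() : ring;
}

// A made-up square listed anticlockwise.
const anticlockwise = [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]];
const fixed = ensureClockwise(anticlockwise);
console.log(JSON.stringify(fixed));
```

Which direction counts as the right one depends on the drawing code; the point is only that the fix is a reversal of the list, not a change to any individual coordinate.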
I was pretty taken aback, and I am now eternally indebted to AmeliaBR, who, a quick inspection of Stack Overflow reveals, seemingly dedicates her waking life to solving the computing problems of the less able. An absolute star.
With this magic code in hand, I was nearly there. The one small problem remaining was the authorities that are made up of a few different areas – Great Yarmouth straddles a river, for instance, and lots of the South Essex coast includes tiny little islands. They needed to be broken up, reversed, and stitched back together again. With some places, I didn’t bother, and just hacked off the tiny extra parts. Anglesey will just have to do without Holyhead. With others – Great Yarmouth being a good example – you can’t really do that. It took a little while, but it’s done now.
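Breaking up, reversing and stitching back together can be sketched like this, assuming the authority is stored as a GeoJSON MultiPolygon with one polygon per separate piece of land. The shapes are invented and `reverseRings` is a hypothetical helper, not the code used for the actual map.

```javascript
// Reverse every ring of a Polygon or MultiPolygon, returning a new geometry
// and leaving the original untouched.
function reverseRings(geometry) {
  const flip = polygon => polygon.map(ring => ring.slice().reverse());
  if (geometry.type === "Polygon") {
    return { type: "Polygon", coordinates: flip(geometry.coordinates) };
  }
  if (geometry.type === "MultiPolygon") {
    return { type: "MultiPolygon", coordinates: geometry.coordinates.map(flip) };
  }
  throw new Error("Unsupported geometry type: " + geometry.type);
}

// A made-up authority in two parts: a "mainland" square and a small "island".
const authority = {
  type: "MultiPolygon",
  coordinates: [
    [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]],
    [[[3, 0], [4, 0], [4, 1], [3, 1], [3, 0]]]
  ]
};

const reversed = reverseRings(authority);
console.log(JSON.stringify(reversed.coordinates));
```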
So then you get a map and it looks like this.
A map of English local authorities that actually works
You can play with it a bit – hovering over the areas reveals the values in the tooltip. Off the top of my head, I can’t now remember what the underlying data is – it’s likely to be people claiming Jobseeker’s Allowance or something. Looks like bad news for Hull and Birmingham, whatever it is. I decided to pull out London, as it’s too small to look at in the full map.
I then worked on an interactive version that compared unemployment among under 25s from the 2001 and 2011 censuses. The finished product is below – the interactivity is based on the same ideas as the interactive graphs I’ve been working on – click a button, pass in new data. The design of it was done by my colleague Hannah, who also pointed me in the direction of colorbrewer2.org to choose the colours. You could lose hours to that site, choosing different colour combinations for maps. The one we ended up with goes from green (good) to red (bad).
We also added in a graph showing the distribution of unemployment rates in 2001 and 2011, as a quick way of looking at the whole picture. (That idea came from the Facts are Sacred book, by Simon Rogers.) This time, the map is of England and Wales.
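The "click a button, pass in new data" idea comes down to mapping each area's value to one of a handful of colours and redoing that mapping when the year changes. Here is a minimal sketch with invented area names and rates. The five hex values are meant to resemble a green-to-red ColorBrewer palette but are an assumption and should be checked against colorbrewer2.org; `makeColourScale` and `coloursForYear` are hypothetical helpers.

```javascript
// Five classes from green (good) to red (bad); hex values are illustrative.
const colours = ["#1a9641", "#a6d96a", "#ffffbf", "#fdae61", "#d7191c"];

// Split the range [min, max] into equal bands, one per colour.
function makeColourScale(min, max, palette) {
  return value => {
    const t = (value - min) / (max - min);
    const i = Math.min(palette.length - 1, Math.max(0, Math.floor(t * palette.length)));
    return palette[i];
  };
}

// Made-up unemployment rates (%) for hypothetical areas, keyed by census year.
const rates = {
  2001: { "Area A": 2.1, "Area B": 7.5 },
  2011: { "Area A": 4.0, "Area B": 14.0 }
};

const scale = makeColourScale(0, 15, colours);

// What a button's click handler would compute before restyling the map.
function coloursForYear(year) {
  const result = {};
  for (const [area, rate] of Object.entries(rates[year])) {
    result[area] = scale(rate);
  }
  return result;
}

console.log(coloursForYear(2001), coloursForYear(2011));
```

Using the same scale for both years matters here: if each year had its own range, an area could keep its colour even though its rate had doubled.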
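The distribution graph rests on a simple counting step: group the rates into bands and count how many areas fall in each. A minimal sketch, with invented rates and a hypothetical `histogram` helper:

```javascript
// Count how many values fall into each band of width `binWidth`.
// Keys are the lower edge of each band.
function histogram(values, binWidth) {
  const counts = {};
  for (const v of values) {
    const start = Math.floor(v / binWidth) * binWidth;
    counts[start] = (counts[start] || 0) + 1;
  }
  return counts;
}

// Made-up unemployment rates (%) for five hypothetical areas in each census.
const rates2001 = [3.1, 4.5, 5.2, 6.8, 7.0];
const rates2011 = [6.3, 8.9, 9.4, 11.2, 12.7];

console.log(histogram(rates2001, 2), histogram(rates2011, 2));
```

Drawing the two sets of counts side by side shows at a glance whether the whole distribution has shifted, which is harder to judge from the map alone.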
Unemployment among Under 25s in the 2001 and 2011 censuses
I think it’s a good visualisation because there’s an obvious story – unemployment gets a lot worse, everywhere. There are geographical aspects to it – the north south split, the deterioration in parts of coastal England, the rural/urban differences – which a map can show that a graph can’t.
And that, more or less, is it in terms of drawing a map. But if we go back to why I wanted to do this, there was the thing about not looking like Google Maps. But a far better reason, when I thought about it, was that Google Maps is a bit restrictive. You’re kind of stuck with their layout, it’s hard to annotate the map or add any more to it graphically. So you’ve got a map, but not much more.
Since finishing this map, I’ve done three more quite quickly. The work is setting the thing up – once that’s done, you’ve essentially got a template you can keep dropping stuff into. Then the only additional work depends on what specific changes to the map you want to make – adding in another year’s data, or a different type of data. The map at the bottom here allows you to choose different data within each year, which was a fair bit more work. But again, once that’s done, it’s done.
The maps we eventually put up had a bit more to them; they were more like full visualisations, with words and other info. It’s really good to have that control. And the longer I spend on this stuff, the more I realise that there’s no other way I’d want to do it – to visualise stuff properly you need to be able to use all of the space how you want it.
Edit: I don’t take comments on the blog as dealing with the spam is too much of a pain. You can find me on twitter (all too often) – @tommacinnes