Responsive graphs (not very slight return)

A responsive graph, yesterday.

My last post was about making interactive visualisations (mainly graphs) work on mobiles as well as desktops and laptops. The idea was that the graph should work as a static exhibit with extra interactive functionality should you need it. In that way, the fact that the interaction can be tricky on mobiles doesn’t mean that the mobile version is pointless, just not quite as good.

This was mainly a design issue – make stuff clearer, have a story or finding more prominent in the viz, that kind of thing. But there is a substantial technical task around making the viz responsive – that is, the right size and shape for the device you’re on. Most websites themselves are responsive now – this one is, for instance. If you’re reading on your desktop, there will be around 12-14 words per line. On a mobile, it’s more like 8-10.

The thing is, the graphs themselves come from elsewhere, pulled in using an iframe, and they are not responsive. So every device tries to draw them as if it were a desktop, meaning on a mobile, the graphs are massive. You can get round that by pinching and dragging etc, but firstly that’s really annoying and secondly it means the text, which is responsive, is all unaligned and the whole thing is a mess.

I have tried making responsive graphs before, and those attempts sit in a rather large file of “Stuff I tried that didn’t work at all”. (There’s a rather curt post about them here, which you can see demonstrates the point.) That was a bit over a year ago, and obviously I wasn’t happy with the solution I found. But one of the things about using D3 is how quickly it’s moving on, how many more people are using it, and so how much better the solutions (free, natch) become.

Some pretty easy googling came up with this piece by Peter le Bek which sets out pretty clearly how to go about it. The idea is dead simple. When you draw a graph, you need to define a space into which you are drawing it – a sort of canvas for it. The dimensions for that canvas are set in pixels. But rather than giving a set number of pixels – it would be around 600 wide for the desktop version of the site, and 360 for the mobile version and now you’re seeing the problem – you can just set the width to be the width of whatever container you’re working in. (Obviously if you want you could set it to be some fraction of that, too.) The height then can be set accordingly – you might not want to end up drawing weird looking long and thin graphs for mobiles all the time.

I should point out that the canvas is set according to the website you’re working on, not the device, which is even better. So you can use the same graph across different websites which are sized ever so slightly differently and it adjusts accordingly, just as it would adjust between a desktop and a mobile.

And everything follows from that. So whereas before I was writing code that said (for example) “My canvas is 500 pixels wide and I want my y axis to be 100 pixels from the left” it now says “My canvas is as wide as the website will let it be and I want my y axis to be at 20% of that width”.
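To make that concrete, here is a minimal sketch of the arithmetic rather than the code behind the graphs on this site. The function name, the 0.6 aspect ratio and the returned field names are all assumptions for illustration; only the 20% axis position and the 600/360 pixel widths come from the text above.

```javascript
// Hypothetical helper: derive every chart dimension from the container's width
// instead of hard-coding pixel values.
// aspectRatio (0.6) is an illustrative assumption; axisFraction (0.2) is the
// "y axis at 20% of the width" example from the text.
function chartLayout(containerWidth, aspectRatio = 0.6, axisFraction = 0.2) {
  const width = containerWidth;
  const height = Math.round(width * aspectRatio);
  const yAxisX = Math.round(width * axisFraction);
  return { width, height, yAxisX, plotWidth: width - yAxisX };
}

// In a browser the width would be read from the container element,
// e.g. via its clientWidth property.
const desktop = chartLayout(600); // { width: 600, height: 360, yAxisX: 120, plotWidth: 480 }
const mobile = chartLayout(360);  // { width: 360, height: 216, yAxisX: 72, plotWidth: 288 }
console.log(desktop, mobile);
```

Because everything hangs off one number, the same graph can be dropped into a container of any width without touching the drawing code.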

That’s all fairly straightforward, but what Peter le Bek suggests is something a bit more interesting. Now that you know how big the container is you’re working in, you can do some very different things according to the device. In his example, he removes all the gridlines and axes for the mobile version to unclutter the viz. I was a bit less ambitious, and just moved the axes a little and adjusted the font size. Anyway, here’s the graph I was working on before, now properly responsive. If you hover over any part of the graph, the title changes to tell you the values you’re looking at.
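The "do different things according to the device" step might look something like the sketch below. The 480 pixel breakpoint, the font sizes and the option names are made up for illustration; they are not taken from Peter le Bek's piece or from the graphs here.

```javascript
// Hypothetical helper: pick display options from the container width.
// The 480px breakpoint and every value below are illustrative assumptions.
function displayOptions(containerWidth, mobileBreakpoint = 480) {
  const isMobile = containerWidth < mobileBreakpoint;
  return {
    showGridlines: !isMobile,           // declutter narrow screens
    fontSizePx: isMobile ? 11 : 14,     // smaller labels on mobiles
    axisFraction: isMobile ? 0.15 : 0.2 // nudge the y axis left to win back room
  };
}

console.log(displayOptions(600)); // gridlines on, 14px labels
console.log(displayOptions(360)); // gridlines off, 11px labels
```

Keeping these choices in one function means the drawing code only ever asks "what are my options?" and never needs to know which device it is on.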

The same principles can apply to maps, too, so here’s a map from our recent Monitoring Poverty and Social Exclusion report (full report here, with a couple more responsive graphs on the landing page and about a hundred more unresponsive ones in the pdf). If you hover over the map you can see the value for each area roughly where Ireland ought to be.

So there we go. What I’ve got now is a decent template for graphs that interact if you want them to but don’t have to, and can sit on any device. It’s still fiddlier than I’d like, mainly because labelling the axes and naming the series can vary quite a lot from graph to graph and you need to make allowance for that. So it could be a bit more standardised and I need to work on that.

But the process has been quite revealing and the main thing is about the tooltip. For me, the advantage of looking at graphs on a website is that you can interact with them, and the most basic way of doing that is with a tooltip – hover over the graph, see the value. What I’ve ended up doing, sort of by default, sort of through trial and error, is to promote the tooltip and turn it into a changing title for the graph. The advantage of that is that you can explain a bit better what the numbers mean – it’s this many people in this year, or this proportion of people in this part of the country. So taking the example from the top of the page, if you hover over the dots, rather than a little box popping up with a number in it, there’s a full sentence at the top of the graph explaining what you’re looking at, i.e. there are more words. Here it is again.

So I think I’ve learned two things here. Firstly, by thinking more about the device people use to look at the graphs, I’ve changed the approach for all devices. Secondly, I’ve realised that one of the key things in data visualisation is how you use words alongside the pictures.

Post script: As if to make the point, if you go to this site on a mobile (i.e. the whole site, not just this one post) it’s one thin, unreadable strip. That’s because the other graphs aren’t responsive, so the screen and the rest of the post resizes to it, rather than vice versa. I could go and change all the old graphs to make everything fit. But I have decided to leave them there as a Warning To Others. Also I am too lazy to change them, it’s loads of work, honestly.

Making it mobile

This blog/project is about making interactive data visualisations, trying to figure out what works and what doesn’t. I think they offer a lot and I think the viewer should expect a lot too – you’re looking at the internet, it moves, you should expect more than you would from a book or newspaper. There’s the potential for an extra dimension and people who are doing data analysis and visualisation should find that exciting. But it’s been bugging me for a while that they don’t quite work on phones – what is a hover on a laptop is a click on a phone and the clicks are hard to click, sometimes it takes a while to load stuff, it’s all too small etc etc. Then I saw this tweet from Randy Olson (Randy moderates the Data is Beautiful Reddit – he knows of which he speaks).

And that made me think rather a lot. So I did some thinking but it came to nothing.

Later on last week I had cause to draw the graph below, which began life as a static graph (Microsoft Excel copied and pasted into Word, as ever). It’s based on Donald Hirsch’s/ End Child Poverty’s new child poverty figures, and it’s basically a cumulative frequency thing – rank areas (in this case, electoral wards) by the proportion of kids in each area who live in poverty, then, moving up the distribution, show the proportion of all children in poverty who live in an area at least as poor as that one. Hover over any part of the graph to see the values etc and so on.

If poor children were equally distributed across all areas, there would be a straight diagonal line from the bottom left to the top right. There’s a curve, obviously, as you’d expect there to be more poor kids in poor areas than non-poor areas (that’s kind of by definition…). The interest, then, is in the steepness of the curve, particularly over on the left, and what that tells us about clustering of poor children.

You can explore the graph and look at different points to see how concentrated the distribution is. So, if you go from the 50% mark on the y axis, you’ll see half of all poor kids live in 22% of areas. Going along the x axis, you’ll see that the poorest 50% of areas have 77% of all poor children. I find the ends of the distribution interesting, too – 20% of poor children in 6% of areas, 10% of poor children in the 30% of least poor areas. And so on.
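For anyone who wants to build the same kind of curve, the calculation can be sketched as below. The field names and the four wards are made up for illustration (the real figures are End Child Poverty's), and the sketch assumes every area has at least one child.

```javascript
// Each area is { children, poor }: hypothetical field names for the number of
// children in the area and the number of those children in poverty.
function concentrationCurve(areas) {
  // Rank areas from the highest child poverty rate to the lowest.
  const ranked = [...areas].sort(
    (a, b) => b.poor / b.children - a.poor / a.children
  );
  const totalPoor = ranked.reduce((sum, area) => sum + area.poor, 0);
  let runningPoor = 0;
  // Moving through the ranking, record the cumulative share of areas (x)
  // against the cumulative share of all poor children (y).
  return ranked.map((area, i) => {
    runningPoor += area.poor;
    return {
      shareOfAreas: (i + 1) / ranked.length,
      shareOfPoorChildren: runningPoor / totalPoor
    };
  });
}

// Made-up example: four equally sized wards.
const wards = [
  { children: 1000, poor: 100 },
  { children: 1000, poor: 400 },
  { children: 1000, poor: 200 },
  { children: 1000, poor: 300 }
];
console.log(concentrationCurve(wards));
// The poorest quarter of wards holds 40% of poor children, the poorest half 70%.
```

If poor children were spread evenly, each step would add the same share and the points would fall on the diagonal.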

But going back to Randy’s point, how does this work on a mobile? Not brilliantly, as anyone who is reading this on a mobile can attest. It’s kind of fiddly to get any particular value, as the columns that form the curve are necessarily very thin, since there’s 100 of them fitting into a narrow space. So you’ll end up poking away trying to get close to a number and maybe you’ll find that enjoyable and maybe you won’t.

I don’t really know how you get round that – a phone screen just is smaller. But there’s a different way of looking at it – what if the clicking weren’t so essential? We could set the graph up in such a way that it tells the viewer the/ a key fact to begin with, which they can explore further if they want. The graph below basically does that, taking what I think is the most interesting finding and setting that up as the starting state. The relevant lines are highlighted in red, and the tooltip is already set at the top, and in a bigger font, making it more obviously a title for the graph. Otherwise it’s exactly the same as the graph above – you can hover/ click on whatever part of the graph to find out more. It’s just that you don’t have to. Basically, it’s now a static graph you can interact with.
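The "static graph you can interact with" idea boils down to a title that starts on the headline finding and changes only if someone hovers or taps. A rough sketch, with hypothetical function and field names and the figures quoted earlier in this post:

```javascript
// Turn a data point into a full sentence rather than a bare number.
// Shares are fractions between 0 and 1 (hypothetical field names).
function titleFor(point) {
  const pct = (x) => Math.round(x * 100) + "%";
  return (
    pct(point.shareOfPoorChildren) +
    " of poor children live in the poorest " +
    pct(point.shareOfAreas) +
    " of areas"
  );
}

// The title begins on a chosen default point; interaction swaps it for
// another point, and reset() returns to the starting state.
function createTitleState(defaultPoint) {
  let current = defaultPoint;
  return {
    text: () => titleFor(current),
    hover: (point) => { current = point; },
    reset: () => { current = defaultPoint; }
  };
}

const title = createTitleState({ shareOfAreas: 0.22, shareOfPoorChildren: 0.5 });
console.log(title.text()); // "50% of poor children live in the poorest 22% of areas"
title.hover({ shareOfAreas: 0.06, shareOfPoorChildren: 0.2 });
console.log(title.text()); // "20% of poor children live in the poorest 6% of areas"
```

The viewer who never touches the graph still gets the key sentence, which is the whole point.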

Which isn’t of course to say that this is perfect on a mobile – I’ve checked, it’s not. The font’s still too small, it’s still a bit hard to compare numbers that appear to be quite close together. It’s definitely heading in the right direction, though.

One of the interesting things about making this kind of stuff is that you’re faced with problems all the time – some of them are design problems and some are technical problems. This one looks like a technical problem – the differences across different platforms – but there are things you can do to the design to make it better. The point here is that there could be some middle ground where there is interactivity for those that can use it but the viz isn’t utterly redundant without it – visualisations where interactivity is useful, but not essential. Or, more snappily, VWIIUBNE. That’s bound to catch on.

I don’t take comments on the blog as the spam is too tedious to deal with but I can be found on twitter – @tommacinnes

Scottish independence

I’ve been working on a little interactive of the Scottish independence vote and how it correlates with a few other variables. It’s pretty simple – some correlations of the proportion voting Yes in each local authority and some of the characteristics of that LA. Here’s where I’ve got to so far. If I can think of any other variables which are distinct enough (e.g. not just breaking down the benefits data, or adding another age group) I’ll pop them in.

Some interesting points (IMO)

1 A really strong link between the Yes vote and the proportion of people on means-tested out-of-work benefits. There’s been analysis elsewhere looking at the links between deprivation and the Yes vote. The deprivation index draws quite heavily on the benefits figures, so I’ve just used the latter.

2 A pretty strong link with the proportion of people who live in urban areas. And actually, it was interesting to see that even in the Highlands, half the population lives in an urban area. Presumably that’s Fort William and Inverness.

3 Not a terribly strong link between age and Yes, at the local authority level. Huge caveat needed here. The opinion polls said over 65s were by far the most likely age group to vote No. The reason the effects are weak in the graph is that most LAs have a similar (or at least similarish) age distribution. The proportion of the population aged over 65 is between 14% and 24% everywhere, which doesn’t give us a lot of variation to analyse.
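For reference, a correlation of this kind can be computed with a few lines of code. The sketch below assumes the measure is the ordinary Pearson coefficient, which is an assumption on my part, and the numbers in the usage lines are made up rather than real local authority figures.

```javascript
// Pearson correlation between two equal-length arrays of numbers.
// Returns a value between -1 (perfect negative link) and 1 (perfect positive link).
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two equal-length arrays with at least two values");
  }
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
  const meanX = mean(xs);
  const meanY = mean(ys);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Made-up illustration: one value per local authority in each array.
const benefitRate = [10, 15, 20, 25];
const yesShare = [38, 44, 47, 53];
console.log(pearson(benefitRate, yesShare)); // close to 1: a strong positive link
```

Note that the result is undefined (NaN) if either variable has no variation at all, and it gets less informative as the spread narrows, which is the age problem described in point 3.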

At the moment (and possibly always, I’m not sure) there is data at the local authority level but no lower. Interestingly, Glasgow Council is going to publish data for each of its Holyrood constituencies (there are 8), which would allow us to dig a little deeper, but Edinburgh is not. Using LA level stuff is OK up to a point, but the conclusions you can draw are bound to be weaker than for, e.g., ward level data, never mind individual voters.

More importantly, all we’re doing here is drawing a correlation. It doesn’t prove that e.g. people on benefits were more likely to vote Yes. For what it’s worth I’m pretty sure they were, but all we can say for sure is that people in areas with high proportions of people on benefits were more likely to vote Yes. We can’t say more unless/ until we get a bit more data about the individuals themselves.

The data comes from a few different sources, mainly via Nomis, but I got the urban rural stuff from the Scottish Government here. The work of pulling the voting data together was very helpfully done by Alistair Rae, from Sheffield University. He’s done some similar analysis which is well worth a look, as is this piece in the FT by John Burn-Murdoch and Aleksandra Wisniewska.


I finally figured out how to make my maps zoomable. It was, compared to actually making a map, really easy – like 6 lines of code. Basically there’s a behaviour in D3 already that allows you to zoom about, you just have to set it to work on the map projection so everything moves together nicely.
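The reason it all moves together is that every boundary is drawn through the same projection, so changing the projection's scale and offset in one place redraws the lot. The sketch below shows just that idea in plain JavaScript. It is not the six lines in question: the projection is a simple linear stand-in rather than a real cartographic one, and the shape of the zoom event is an assumption. In D3 itself the scale and translate values would come from its built-in zoom behaviour (d3.behavior.zoom in version 3).

```javascript
// Stand-in for a map projection: plain linear scaling of [lon, lat] to pixels.
// A real projection is more involved; this only shows why updating scale and
// translate in one place moves every shape at once.
function makeProjection(scale, translate) {
  return ([lon, lat]) => [
    lon * scale + translate[0],
    -lat * scale + translate[1] // screen y grows downwards, latitude grows upwards
  ];
}

// Hypothetical zoom event shape: { scale: number, translate: [x, y] }.
// Every ring of coordinates is re-projected through the same updated projection.
function redraw(rings, zoomEvent) {
  const project = makeProjection(zoomEvent.scale, zoomEvent.translate);
  return rings.map((ring) => ring.map((point) => project(point)));
}

const rings = [[[0, 0], [1, 1]]];
console.log(redraw(rings, { scale: 10, translate: [100, 100] })); // [[[100,100],[110,90]]]
console.log(redraw(rings, { scale: 20, translate: [50, 50] }));   // [[[50,50],[70,30]]]
```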

It doesn’t interfere with the tooltip or the interactivity or anything. Although the more you zoom in the more the boundaries get a bit weird – there appears to be a lake just north of Croydon, for instance. The example below is based on the young adult unemployment data I got from the census.

This, from D3 Tips and Tricks, was what I used – it’s really very helpful indeed.

Small data – FiveThirtyEight and predicting the World Cup

A mug’s game

I should really begin with some sort of “full disclosure”. Despite being pretty obsessive about football – I have a season ticket to my home town club despite living a good couple of hours away by train, my favourite TV programme by miles is the Bundesliga highlights on ITV4 – and despite doing “stats” for a living, I am terrible at predicting football matches. I got beat in my World Cup Fantasy Football league by a child. I thought my team, Norwich, would finish comfortably mid-table last season, thanks to goals from our new striker, Dutch international Ricky van Wolfswinkel. We got relegated, and, with his one goal all season, Ricky van Wolfswinkel is no longer a Dutch international.

But that doesn’t mean I didn’t join in the schadenfreude around all those mistaken World Cup stats models when Germany battered Brazil last week. My favourite tweet was this

And I sent this one on my way home from the pub, where I appear to be criticising data viz legend Alberto Cairo (I wasn’t, but still, for shame. Also, for drink).

The main butt of the jokes appeared to be FiveThirtyEight, the data journalism site set up by Nate Silver. They gave Brazil a 65% chance of winning that game. That makes them definite favourites. (Again, full disclosure, I thought Brazil would win too, but we know about me now). Here’s one of many, many tweets.

(Note – that’s a different %age to the one I saw. Anyway)

Most bookies thought the odds were roughly even, with Germany possibly slight favourites. You could find pundits backing both sides, but most thought it would be close. So, in not predicting a 7-1 battering, FiveThirtyEight were far from alone, but they did rate Brazil’s chances higher than most.

So how did this high rating come about? FiveThirtyEight built a model estimating each side’s chances of winning the tournament before it kicked off, and adjusted those chances as the games went on. The high rating for Brazil comes, therefore, from this initial model. They have published the data behind their model, which is both honest and useful of them so we can have a poke around.

Brazil – everyone’s favourites

As a starting point, here’s a graph showing their pre tournament calculations of the probability of each country winning the World Cup. For comparison, we’ve also got the odds, supplied by FiveThirtyEight in their original post, from Betfair, a betting exchange.

There appears to be an outlier.

That 45% chance of Brazil winning the whole thing is pretty high. It’s a 32 team tournament, some of the others are pretty good, saying Brazil were almost as likely as not to win seems pretty brave. It stands out. (Note that as a result of the Brazil figure being so high, the other favourites are necessarily low, so Betfair gives them a greater chance of winning than FiveThirtyEight does).

Yes, Brazil were the favourites with the bookies too. That’s fine, the favourites don’t always win. But the bookies had the chances of winning closer to 20%, half that of FiveThirtyEight. So why were FiveThirtyEight’s ratings so high? There could be a number of reasons, but there seems to be one big one – Home Field Advantage.
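Comparing a model's probability with a bookmaker's price means turning the odds into a probability first. A minimal sketch of that arithmetic, assuming decimal odds (a price of 5.0 is an illustrative number chosen to match the roughly 20% mentioned above, not a quoted Betfair price):

```javascript
// Decimal odds of 5.0 mean a winning 1-unit stake returns 5 units in total,
// which implies a probability of 1 / 5 = 20%.
function impliedProbability(decimalOdds) {
  return 1 / decimalOdds;
}

// Implied probabilities across a whole market can add up to more than 1
// (the margin built into the prices), so rescale them to sum to exactly 1
// before comparing them with a model's probabilities.
function normalise(probabilities) {
  const total = probabilities.reduce((sum, p) => sum + p, 0);
  return probabilities.map((p) => p / total);
}

console.log(impliedProbability(5)); // 0.2
console.log(normalise([0.5, 0.5, 0.25])); // approximately [0.4, 0.4, 0.2]
```

On that scale a 45% rating corresponds to decimal odds of about 2.2, which gives a sense of how far apart the two views of Brazil were.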

There’s no place like etc

Sports teams tend to do better playing at home than away from home, and in football this is more pronounced than in other sports. Nate Silver, in his piece before the tournament, is quite clear that this gives Brazil a big advantage, and that’s the difference between his and other estimates.

One of the reasons teams do better at home than away is that refereeing decisions tend to go in their favour, and it wasn’t hard to spot examples of this in Brazil, from Fred’s hilarious penalty winning dive against Croatia to the non booking of Fernandinho for his continued assaults on James Rodriguez. With my small team supporter’s hat on, I would also point out that these are the kind of decisions that the big sides always get. That may also be irrelevant, but anyway.

Moreover, Brazil are (/ were) indisputably good at home. They hadn’t lost a competitive game at home since the 1970s. No European team had won in Brazil at all since John Barnes ran through the Brazilian defence in 1984. That’s a pretty good record.

But is it that amazing? Firstly, John Barnes’s goal is great (watch it again!) but Brazil rarely played friendlies at home until the run up to this year’s World Cup, preferring to jet off round the world for oodles of cash instead.

Likewise, competitive international matches are quite rare, so it’s not hard to go a long time without losing. England, rubbish England, who never, ever, ever learn, have only lost twice at home in competitive games since 2000, and each defeat cost the manager his job – Keegan’s resignation in the toilets and McClaren’s humiliation by umbrella.


Spain have only lost one competitive game at home since 2000, Argentina two (in normal time plus one on penalties).

Competitive home games in South America are less common, too. Unlike the European Championships, there is no qualification for the Copa America – the entire continent, plus the occasional guest, qualifies and the tournament takes place in one host nation. So if you’re not the host, there are no competitive home games. Brazil hasn’t hosted the Copa since 1989, when, obviously, they won, but that means that the only competitive home games they’ve played since then are World Cup qualifiers. As hosts, they didn’t have to qualify this time round, so that’s no competitive home games since 2009.

Small data

So we’re looking at quite a small dataset here and Brazil might not be quite as good at home as we think. This is, I think, a specific example of a general problem with modeling sports results for quadrennial tournaments – there just aren’t enough data points to go around. International teams don’t play very often, so models such as that used by FiveThirtyEight rely on club statistics. That seems fair enough, but the model only had good stats for a handful of European leagues and none at all for the Brazilian league. They likely underestimated quite how useless Fred was, for instance, since he doesn’t play in Europe.

It’s hard, then, using the data available to quantify home field advantage, and harder still in a tournament setting. There’s some evidence for it – England, France, South Korea and Japan all had their best World Cup results as hosts. There’s some evidence of no effect – no hosts have won the European Championships since France in 1984 and the last two sets of co-hosts have made almost no impact at all. In the Copa America, the hosts have won three out of the 11 tournaments in which there has been a host nation.

If you wanted to play pop sports psychologist, and we all do, you could make a case for home field advantage being weaker in a tournament setting. In a league, and in the continental level knock out club competitions, teams play each other home and away. You can, to a certain extent, cede some advantage as the away side if you know that you’ve got the opportunity to redress the balance back at your place (or, even more so, if you bring such an advantage to the second leg). In a tournament, though, there are no second chances, and we saw Croatia, Chile and Colombia, if not so much Mexico and Cameroon, really take the game to Brazil.

In order to get a handle on the impact of home field advantage at the tournament level, you would need some sort of idea of how a host fared compared to how it would have fared elsewhere. To do that, you’d need an idea of how good the team was anyway and given that we don’t really have enough data on the national sides that would presumably require the kind of individual player stats at club level that we also don’t really have. By this stage, we’re piling small data on small data and it’s starting to creak.

Home field disadvantage?

So that’s the stats. But there was something else about the home field this time – the pressure it appeared to put the Brazilian players under. The game against Germany was a full on 11-man meltdown, which started like this


and somehow became less professional. It’s hard to imagine that such a loss of perspective upon losing a player to injury, or control upon conceding an early goal could have happened in, say, South Africa four years ago.

A lot of the commentary after the game focused on the way the whole side crumbled under the pressure, but this was not just wisdom after the event. This is an interesting piece by Juninho, who won the World Cup with Brazil in 2002 and was excellent on the BBC throughout the competition. It was written after the Croatia game and, almost in passing, he compares the current side to the one he played in, saying

“I know (the players) feel they are carrying a lot more responsibility on their shoulders”

Even in the opening game, against Croatia, Neymar was crying during the national anthem. Croatia started much the better side, and scored early due to an own goal from Marcelo. The pressure had been evident from the start and what happened in the semi final was an extreme version of what we’d seen previously.

None of which is to say modeling is stupid because it can’t predict such extreme responses, or such crazy results. Of course it can’t, no one can (actually, almost no one) and in many ways that’s the beauty of the whole thing. But the shortage of data is a problem, if you end up stacking one set of assumptions on top of another. In this case, it leads to Brazil’s home field advantage being inflated, and especially so for a tournament setting. That in turn means their chances of winning are inflated well beyond what most other people think, which means FiveThirtyEight stands out, hence the 140 character teasing in their direction.

Presumably, now we are officially in the era of big data, some of the inputs to these kinds of models will improve, even if the central problem of there only being one World Cup every four years remains unsolved. The interesting thing, then, is what happens now to the weight given to home field advantage in the FiveThirtyEight model. Presumably it gets adjusted down, because Brazil lost 7-1 while playing at home. But what about losing 7-1 because they were playing at home? Can we have a dummy variable for the crushing weight of 64 years of cultural, social and sporting expectation? Because we’ll need it if England host in 2030.

Odds/ ends

I’ve been working on quite a lot of things that I haven’t really managed to finish, so I thought I’d start putting them up here. Often when I’m doing a visualisation I’ll start with the analysis and then do the coding. Sometimes though I want to come up with a particular type of presentation, just to see how it works. Without the impetus of a decent piece of analysis behind it, though, they just languish on my hard drive.

There’s one particular example recently, where I wanted to do something compact for mobiles that would be interactive but simple. I based it on this idea by Scott Murray, which in turn was adapted here. Basically, what you’ve got in the example are three overlapping shapes. As you click one it moves to the front of the pack. It’s a really neat effect and you can try it out below.

At the same time, I was starting to get bothered by how the interactives I’d been working on aren’t great on mobiles. Those maps, for instance, are far too detailed for a 4 by 3 inch screen. All the clicking is really fiddly, the mouseover is actually a touch on a touch screen, which is also fiddly.

So this presentation seemed to be a good solution to that problem. You can have big things to click, but they overlap so they don’t take up much space. I thought that maybe you could use it for pie charts, showing each slice separately, which would give the slices more room. When I was working on it there had been some stats out about Food Banks and who was using them so I used those numbers. The text is too big cos I never got round to sizing it properly. And the colours are horrendous cos I never got round to doing nicer ones. Anyway.

What I thought, and still think, about this, is that it’s OK but a bit pointless. It seems a lot of work – a click! – to get one number. And there are only four numbers in the whole thing, including the total, that you see when you open it up. So I’m not really convinced it’s worth it. Also, it sort of doesn’t really work on a mobile. There’s some technical stuff I can’t quite crack about getting it to fill the screen and not doing this weird blinking thing on each click that happens at the moment. So it’s not great all round.

In retrospect, what happened here is that I had a solution – this nice bringing to the front thing that someone else had developed – and I tried to find it a problem to solve. That’s unlikely to be the right way round. I can see that the effect might be useful as a smaller part of something else, and I now have that on hand should I ever need it. But I’ll use it because I need it, not because I’ve got it.
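For the record, the bring-to-front effect itself is tiny. SVG paints elements in document order, so whatever comes last sits on top, and "bringing to the front" just means moving the clicked shape to the end of the list. A sketch of that reordering, with hypothetical shape ids (this is my paraphrase of the idea, not Scott Murray's code):

```javascript
// Return a new drawing order with the clicked shape moved to the end,
// which is the top of the pile because later elements are painted last.
function bringToFront(order, clickedId) {
  if (!order.includes(clickedId)) return [...order]; // unknown id: leave order alone
  return [...order.filter((id) => id !== clickedId), clickedId];
}

let order = ["slice-a", "slice-b", "slice-c"];
order = bringToFront(order, "slice-a");
console.log(order); // ["slice-b", "slice-c", "slice-a"]
```

In the browser the same thing is usually done by re-appending the clicked node to its parent, which moves it to the end of its siblings.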

Let’s draw maps

From around the middle of January, I’d been trying to figure out how to make interactive maps. Last month, I put up my first finished map (I say “my” – I got a lot of help along the way, and those people, who I owe a lot of thanks to, will be acknowledged below). So it looks like it took about two months to figure out how to do it. It felt like rather more. What I’m going to do here is explain a little about what I did, why it didn’t work, what I then did, why that was OK but still needed work, then what I ended up doing. It’s likely to be a long post.

Background – the kind of maps I wanted to draw

What I was keen on doing was coming up with an interactive map where you would colour in areas according to certain population level data – eg unemployment rates for an area, maybe GDP for a country, that kind of thing. I wasn’t looking to draw GIS maps, maps that would help people navigate between places. My interest was more in using the map as a picture, that could tell us about how the characteristics of areas differed.

I also wanted something that looked a bit different from most similar maps online. I don’t much like, for instance, the google fusion approach, where colours are overlaid on an existing street map. Visually I think it’s not great – it’s a bit cluttered. (Also there’s something about them that is almost misleading – including streets, and even, on a close zoom, buildings in the picture could make the viewer think that the data applies at the level of the street, or even the building, when actually it’s an average of a larger area.) Really what I was looking to do was create something quite clean looking that I could give an identity to.

Starting out

Like almost all my D3 expeditions, this one begins with Scott Murray’s book, Interactive Data Visualization for the Web. There’s a chapter in there about drawing a map of the United States, linking it to some data and then colouring it in nicely.

The key to drawing the map of the United States is getting a file of the map coordinates in JSON format. You link that to the D3 code which does the work of turning these coordinates into lines and borders that form shapes that you can then colour in as you see fit. So what I needed was a JSON file for the UK, showing the boundaries of UK local authorities.
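To give a feel for what such a file contains and what the drawing code does with it, here is a stripped-down sketch. The feature below is a made-up square in the GeoJSON layout, and the function is a simplified stand-in for D3's own path generator, not a copy of it; the projection is a hypothetical 100 pixels per degree.

```javascript
// A minimal GeoJSON-style feature with made-up coordinates.
// Each coordinate is [longitude, latitude]; a polygon is a list of rings.
const feature = {
  type: "Feature",
  properties: { name: "Example authority" },
  geometry: {
    type: "Polygon",
    coordinates: [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]
  }
};

// Turn each ring into an SVG path string: M = move to, L = line to, Z = close.
// `project` converts a [lon, lat] pair into [x, y] pixels.
function polygonToPath(geometry, project) {
  return geometry.coordinates
    .map((ring) => {
      const points = ring.map((coordinate) => project(coordinate).join(","));
      return "M" + points.join("L") + "Z";
    })
    .join("");
}

// Hypothetical projection: 100 pixels per degree, y flipped for the screen.
const project = ([lon, lat]) => [lon * 100, 100 - lat * 100];

console.log(polygonToPath(feature.geometry, project));
// "M0,100L0,0L100,0L100,100L0,100Z"
```

The resulting string is what goes into the `d` attribute of an SVG path, which is the shape you then colour in.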

The first place I went was the Ordnance Survey, which gives away loads of boundary data, mainly as shapefiles. Shapefiles (.shp files) are what MapInfo and ArcGIS use. They make no sense without the software to read them, and I don’t have the software.

What I had to do was find something to turn these files into JSON files. Scott Murray recommended mapshaper, which allows you to drop files in and convert them to other file types.

That’s not the half of it, though. The main thing mapshaper does is allow you to smooth out wiggly lines, of which the UK has loads, both as internal borders between areas and, more obviously, in the form of coastline. Doing this makes the file size much smaller, and so the map loads much quicker which is obviously a good thing.
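To show what "smoothing out wiggly lines" means in practice, here is a sketch of one classic line simplification method, Douglas-Peucker. I am using it purely as an illustration; I am not claiming it is what mapshaper does under the hood. The idea is to drop any point that sits within a chosen tolerance of the straight line between its neighbours' endpoints.

```javascript
// Perpendicular distance from point p to the straight line through a and b.
function distanceToLine(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const length = Math.hypot(dx, dy);
  // If a and b coincide (e.g. a closed ring), fall back to distance from a.
  if (length === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  return Math.abs(dy * p[0] - dx * p[1] + b[0] * a[1] - b[1] * a[0]) / length;
}

// Douglas-Peucker: keep the endpoints, find the point furthest from the line
// between them, and recurse on both halves only if it is beyond the tolerance.
function simplify(points, tolerance) {
  if (points.length < 3) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];
  let maxDistance = 0;
  let index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = distanceToLine(points[i], first, last);
    if (d > maxDistance) {
      maxDistance = d;
      index = i;
    }
  }
  if (maxDistance <= tolerance) return [first, last];
  const left = simplify(points.slice(0, index + 1), tolerance);
  const right = simplify(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right); // avoid duplicating the shared point
}

// Made-up wiggly line: the small bump at [1, 0.05] is within tolerance and goes.
const line = [[0, 0], [1, 0.05], [2, 0], [3, 2], [4, 0]];
console.log(simplify(line, 0.1)); // [[0,0],[2,0],[3,2],[4,0]]
```

Fewer points means a smaller file, which is where the faster loading comes from; push the tolerance too high and the coastline starts to look like it was drawn with a ruler.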

Anyway, the net result of all this is that it spits out a JSON file which you can point the code at, and it will just draw you the map you need. Pretty much. Here is my first attempt. I have coloured the authorities in red, as you can see.

A map of English local authorities.

Which is pretty good as a first try. Here’s a second effort

A second map of English local authorities


Clearly, this is a more detailed map. But it’s still not quite right.

What’s going wrong is the projection. Anytime you want to reproduce a 3d picture (which is what a map of the UK is) in 2d, you need to use a projection to map those coordinates. And if you get the wrong one, pretty much anything could happen.

The problem I was having was to do with the coordinate system the Ordnance Survey uses, which is unique to its own surveys, rather than a global system. When I put those coordinates into my code template, they got read as the above. Which is obviously hopeless.
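With hindsight, a quick sanity check on the numbers would have flagged the problem before any drawing happened. Global coordinates are degrees, so longitude has to sit between -180 and 180 and latitude between -90 and 90, whereas Ordnance Survey grid references are distances in metres and run into the hundreds of thousands. A rough sketch (the function name is made up, and the grid values below are approximate figures for central London, used for illustration only):

```javascript
// True if every [x, y] pair could plausibly be [longitude, latitude] in degrees.
// This cannot prove the data is lon/lat, but it catches grid coordinates in metres.
function looksLikeLonLat(ring) {
  return ring.every(([x, y]) => Math.abs(x) <= 180 && Math.abs(y) <= 90);
}

// Roughly central London, expressed both ways (approximate values).
const inDegrees = [[-0.12, 51.5], [-0.11, 51.51]];
const inGridMetres = [[530000, 180000], [531000, 181000]];

console.log(looksLikeLonLat(inDegrees));    // true
console.log(looksLikeLonLat(inGridMetres)); // false
```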

A new approach

What I needed, then, were files that used the right kinds of coordinates and actually there are plenty out there. Every time someone makes a google fusion map, if the settings are friendly enough, you can download the file that draws the boundaries. These boundaries are KML files but the key thing is that they are drawn off global coordinates, which are the ones my D3 code knows how to deal with. So we’re getting somewhere here.

Simon Rogers, who used to run the Guardian datablog and now works at twitter, draws lots of google fusion maps, and has a bunch of boundary files on his site, including some for UK local authorities. They’re free to download, so I took them.

All the data I need is in these files. The coordinates are global – you can see, for instance, in the borough of Greenwich that the latitude crosses 0 degrees. There’s a bit of messing about to do to get the right punctuation for a JSON file – JSON uses lots of [square brackets], KML files tend not to – but mostly it’s easy enough.
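The messing about with punctuation can be scripted. In a KML file a boundary is a run of whitespace-separated "longitude,latitude,altitude" tuples, while a JSON boundary wants nested square brackets. Here is a sketch of the conversion, assuming you have already pulled the text out of the KML coordinates element; the function names and the coordinates (somewhere near Greenwich) are made up for illustration.

```javascript
// Convert a KML coordinate string into a JSON-style ring of [lon, lat] pairs.
// The optional altitude in each tuple is dropped.
function kmlCoordinatesToRing(text) {
  return text
    .trim()
    .split(/\s+/)
    .filter((tuple) => tuple.length > 0)
    .map((tuple) => tuple.split(",").slice(0, 2).map(Number));
}

// Wrap a ring in a GeoJSON-style feature so it can sit in a file of boundaries.
function ringToFeature(name, ring) {
  return {
    type: "Feature",
    properties: { name },
    geometry: { type: "Polygon", coordinates: [ring] }
  };
}

const kmlText = "-0.02,51.48,0 0.01,51.48,0 0.01,51.50,0 -0.02,51.48,0";
const ring = kmlCoordinatesToRing(kmlText);
console.log(JSON.stringify(ringToFeature("Example borough", ring)));
```

Note the longitude changing sign partway through the ring, which is the 0 degrees crossing mentioned above.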

But after getting one clean line of JSON for each local authority, I ran into a problem. I could see the different LAs on the screen, the boundaries looked fine, but when I tried to colour them in, it would colour in the whole page. I thought that maybe the boundaries weren’t closed, but they kind of are by definition – that’s how the things work, they’re just lists of coordinates and the code links them up.

So, as I always do, I went to Stack Overflow and posted my problem. And, as always happens, someone helped me out. They pointed out that the problem was that the boundaries were effectively drawn inside out, so that the areas I was seeing were cut-outs from the whole screen; the obverse of what I was looking for. So rather than colouring in the shape of the local authority, I was colouring everything but that shape and after I’d coloured in a couple, the whole screen was coloured in – local authorities, the North Sea, the Channel, Ireland, the lot.

The reason this had happened was that the coordinates were written anti-clockwise, whereas the code wanted clockwise coordinates. My Stack Overflow helper sent me a link to a place that would tell me how to reverse them without tediously going through each one and rewriting them, which would have been impossible. It looked a little complicated, and it was already late in the day so I thanked my correspondent and went to the pub.

By the time I got back from the pub, the same person had responded again to my thank-you post. They said they hadn’t realized quite what they were linking to, and yes it did look rather complicated, and why don’t you try this piece of code I’ve just written especially that reverses all the coordinates for you, and prints the new ones you need underneath the map that demonstrates that this all now works?

I was pretty taken aback, and I am now eternally indebted to AmeliaBR, who, a quick inspection of Stack Overflow reveals, seemingly dedicates her waking life to solving the computing problems of the less able. An absolute star.
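AmeliaBR’s code isn’t reproduced here, but the underlying idea can be sketched in a few lines. The version below is only an illustration and the function names are invented. It treats the coordinates as flat x/y points (a simplification, since the real boundaries sit on a sphere, but a reasonable one for shapes the size of a local authority), works out which way round a ring runs using the shoelace formula, and reverses it if it runs anti-clockwise.

```javascript
// Shoelace formula. With x = longitude and y = latitude (y pointing up),
// a positive result means the ring runs anti-clockwise, negative clockwise.
function signedArea(ring) {
  let sum = 0;
  for (let i = 0; i < ring.length; i++) {
    const [x1, y1] = ring[i];
    const [x2, y2] = ring[(i + 1) % ring.length];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

// Return a clockwise copy of the ring, leaving the original untouched.
function makeClockwise(ring) {
  return signedArea(ring) > 0 ? ring.slice().reverse() : ring.slice();
}

// A unit square listed anti-clockwise, purely as a test shape.
const square = [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]];
console.log(JSON.stringify(makeClockwise(square)));
```

Reversing the list is all it takes: the points stay the same, only the order they are joined up in changes.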

With this magic code in hand, I was nearly there. The one small problem remaining was the authorities that are made up of a few different areas – Great Yarmouth straddles a river, for instance, and lots of the South Essex coast includes tiny little islands. They needed to be broken up, reversed, and stitched back together again. With some places, I didn’t bother, and just hacked off the tiny extra parts. Anglesey will just have to do without Holyhead. With others – Great Yarmouth being a good example – you can’t really do that. It took a little while, but it’s done now.
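The break up, reverse, stitch together step could be sketched like this, assuming the end result is a GeoJSON `MultiPolygon` (the post doesn’t say exactly what structure was used). `toMultiPolygon` is a made-up name, and the two triangles are placeholder shapes rather than real boundaries.

```javascript
// Take the separate parts of one authority (each a ring of [lon, lat] pairs),
// reverse every ring, and stitch them into a single MultiPolygon geometry.
function toMultiPolygon(parts) {
  return {
    type: "MultiPolygon",
    // Each part becomes its own polygon with one outer ring.
    coordinates: parts.map(ring => [ring.slice().reverse()])
  };
}

// Two placeholder parts, standing in for the two sides of a river.
const parts = [
  [[0, 0], [1, 0], [0, 1], [0, 0]],
  [[5, 5], [6, 5], [5, 6], [5, 5]]
];
console.log(JSON.stringify(toMultiPolygon(parts)));
```

The extra level of brackets is what lets one authority be several separate shapes that still get coloured and hovered over as a single area.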

So then you get a map and it looks like this.

A map of English local authorities that actually works

You can play with it a bit – hovering over the areas reveals the values in the tooltip. Off the top of my head, I can’t now remember what the underlying data is – it’s likely to be people claiming Jobseeker’s Allowance or something. Looks like bad news for Hull and Birmingham, whatever it is. I decided to pull out London, as it’s too small to look at in the full map.

I then worked on an interactive version that compared unemployment among under 25s from the 2001 and 2011 censuses. The finished product is below. The interactivity is based on the same ideas as the interactive graphs I’ve been working on – click a button, pass in new data. The design of it was done by my colleague Hannah, who also pointed me in the direction of a site for choosing the colours. You could lose hours to that site, choosing different colour combinations for maps. The one we ended up with goes from green (good) to red (bad).
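As a rough sketch of the “click a button, pass in new data” idea combined with a green-to-red scale, something like the following would do the job in plain JavaScript. It is an assumption about the general shape, not the code behind the map: the real colours came from a chosen palette rather than a straight blend, the function names are invented, and the figures are made up.

```javascript
// Blend between two RGB colours according to where `value` sits in [min, max].
function colourScale(min, max, from, to) {
  return function (value) {
    // Clamp to the range so out-of-range values get the end colours.
    const t = Math.min(1, Math.max(0, (value - min) / (max - min)));
    const channel = i => Math.round(from[i] + t * (to[i] - from[i]));
    return "rgb(" + channel(0) + "," + channel(1) + "," + channel(2) + ")";
  };
}

// "Click a button, pass in new data": pick one year's values and colour them.
function coloursForYear(dataByYear, year, scale) {
  const colours = {};
  for (const area of Object.keys(dataByYear[year])) {
    colours[area] = scale(dataByYear[year][area]);
  }
  return colours;
}

// Made-up rates for two placeholder areas, not the census figures.
const exampleData = {
  2001: { "Area A": 5, "Area B": 15 },
  2011: { "Area A": 10, "Area B": 30 }
};
const greenToRed = colourScale(0, 30, [0, 128, 0], [255, 0, 0]);
console.log(coloursForYear(exampleData, 2011, greenToRed));
```

A button handler then only needs to call `coloursForYear` with a different year and reapply the fills, which is why adding another year is cheap once the template exists.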

We also added in a graph showing the distribution of unemployment rates in 2001 and 2011, as a quick way of looking at the whole picture. (That idea came from the Facts are Sacred book, by Simon Rogers.) This time, the map is of England and Wales.

Unemployment among Under 25s in the 2001 and 2011 censuses

I think it’s a good visualisation because there’s an obvious story – unemployment gets a lot worse, everywhere. There are geographical aspects to it – the north-south split, the deterioration in parts of coastal England, the rural/urban differences – which a map can show that a graph can’t.

And that, more or less, is it in terms of drawing a map. But if we go back to why I wanted to do this, there was the thing about not looking like google maps. But a far better reason, when I thought about it, was that google maps is a bit restrictive. You’re kind of stuck with their layout, and it’s hard to annotate the map or add any more to it graphically. So you’ve got a map, but not much more.

Since finishing this map, I’ve done three more quite quickly. The work is setting the thing up – once that’s done, you’ve essentially got a template you can keep dropping stuff into. Then the only additional work depends on what specific changes to the map you want to make – adding in another year’s data, or a different type of data. The map at the bottom here allows you to choose different data within each year, which was a fair bit more work. But again, once that’s done, it’s done.

The maps we eventually put up had a bit more to them; they were more like full visualisations, with words and other info. It’s really good to have that control. And the longer I spend on this stuff, the more I realise that there’s no other way I’d want to do it – to visualise stuff properly you need to be able to use all of the space how you want it.

Edit: I don’t take comments on the blog as dealing with the spam is too much of a pain. You can find me on twitter (all too often) – @tommacinnes


This is a stacked bar graph showing changes over time in some (made up) measure, broken down into its component parts (Things A to D). The static version would be OK, but a bit limited – it would show a total pretty well, but it’s hard to compare the changes in the component parts as (with the exception of whichever series you choose to go along the x axis) they all start at different heights. So what this graph does is allow you to choose the component parts you’re interested in and just compare those. That allows the user to identify which parts are driving the change in the whole. Click the squares at the top to show or remove the different slices of the chart in whichever combination you like (the numbers are nonsense, obviously).

Most of the graphs I’ve done so far allow the user to look at one thing, then another thing, then back to the first thing, maybe via some third thing, the advantage being that it’s all in the same space on the screen. That’s handy. But this allows the user to look at things in different ways, according to their preference, and that’s subtly different. And, I think, better.

(edit: Just struck me that maybe the difference is just quantitative, as there are simply more options to view. This graph has 4 different views, for all its seeming complexity. The one above has 15).
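The 15 is the number of ways of showing at least one of the four components: each can be on or off, giving 2 × 2 × 2 × 2 = 16 combinations, minus the one where everything is switched off. A quick way to check that, using a hypothetical helper that lists every combination:

```javascript
// List every non-empty combination of the given items.
// Each bit of `mask` says whether the item at that position is included.
function nonEmptySubsets(items) {
  const subsets = [];
  for (let mask = 1; mask < (1 << items.length); mask++) {
    subsets.push(items.filter((_, i) => (mask & (1 << i)) !== 0));
  }
  return subsets;
}

const views = nonEmptySubsets(["A", "B", "C", "D"]);
console.log(views.length); // 15
```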

As always, it’s done using D3, and builds on the principles in this graph. The main challenge is getting the bars to move around when you remove others, but that’s pretty straightforward after a fashion. The code is a bit repetitive, but the main thing I wanted to avoid was writing code for every possible permutation of the four components separately and I managed that so I’m happy enough.
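One way to avoid writing a case for every combination, sketched here in plain JavaScript rather than taken from the graph’s actual code, is to work out each bar’s slices from whichever components are currently switched on. The names (`stackVisible`, `visible`, `y0`, `y1`) are assumptions for the example and the numbers are nonsense, in keeping with the graph.

```javascript
// For one bar, return the slices to draw: each visible component sits on
// top of the visible ones before it, and hidden components take no space.
function stackVisible(row, keys, visible) {
  let base = 0;
  const segments = [];
  for (const key of keys) {
    if (!visible[key]) continue;
    segments.push({ key: key, y0: base, y1: base + row[key] });
    base += row[key];
  }
  return segments;
}

// One made-up bar, with Things B and D switched off.
const bar = { A: 3, B: 2, C: 5, D: 1 };
const shown = { A: true, B: false, C: true, D: false };
console.log(JSON.stringify(stackVisible(bar, ["A", "B", "C", "D"], shown)));
```

Clicking a square then just flips one entry in `visible` and recomputes, and the bars can be moved to their new `y0` and `y1` positions whatever combination is showing.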