The EU referendum: voting intention vs voting turnout

Next month, the UK is having a referendum on the question of whether it should remain in the European Union, or leave it. All of us citizens have the opportunity to pop down to the ballot box to register our views. And in the meantime we’re subjected to a fairly horrendous mishmash of “facts” and arguments as to why we should stay or go.

To get the obvious question out of the way, allow me to volunteer that I believe remaining in the EU is the better option, both conceptually and practically. So go tick the right box please! But I can certainly understand the level of confusion amongst the undecided when, to pick one example, one side says things like “The EU is a threat to the NHS” (and produces a much ridiculed video to “illustrate” it) and the other says “Only staying in Europe will protect our NHS”.

So, what’s the result to be? Well, as with any such election, the result depends on both which side each eligible citizen actually would vote for, and the likelihood of that person actually bothering to turn out and vote.

Although overall polling is quite close at the moment, different sub-groups of the population have been identified that are more positive or more negative towards the prospect of remaining in the EU. Furthermore, these groups vary in how likely they are to say they will go out and vote (which it must be said is a radically different proposition to actually going out and voting – talk is cheap – but one has to start somewhere).

YouGov recently published some figures they collected that allow one to connect certain subgroups in terms of the % of them that are in favour of remaining (or leaving, if you prefer to think of it that way around) with the rank order of how likely they are to say they’ll actually go and vote. Below, I’ve taken the liberty of incorporating that data into a dashboard that allows exploration of the populations they segmented by, their relative likelihood to vote “remain” (invert it if you prefer “leave”), and how likely they are to turn out and vote.

Click here or on the picture below to go and play. And see below for some obvious takeaways.

Groups in favour of remaining in the EU vs referendum turnout intention

So, a few thoughts:

First we should note that the ranks on the slope chart perhaps over-emphasise differences. The scatterplot helps by showing the actual percentage of each population that might vote to remain in Europe, as opposed to the simple ranking. Although there is substantial variation, there’s no mind-blowing trend linking the % who would vote remain to the turnout rank (1 = most likely to claim they will turn out to vote).

Remain support % vs turnout rank

I’ve highlighted the extremes on the chart above. Those most in favour of remaining are Labour supporters; those least in favour are UKIP supporters. Although we might note that apparently 3% of UKIP fans would vote to remain. This is possibly a 3% that should get around to changing party affiliation, given that UKIP was largely set up to campaign to get the UK out of Europe, and its current manifesto rants against “a political establishment that wants to keep us enslaved in the Euro project”.

Those claiming to be most likely to vote are those who say they have a high interest in politics; those least likely are those who say they have a low interest. This makes perfect sense – although it should be noted that one’s personal interest in politics of course has no bearing on the impact of other people’s political decisions, which will be imposed upon one regardless.

So what? Well, at a conference I went to recently, I was told that a certain US object d’ridicule, Donald Trump, has made effective use of data in his campaign (or at least his staff did). To paraphrase, they apparently realised rather quickly that no amount of data science would result in the ability to make people who do not already like Donald Trump’s senseless, dangerous, awful policies become fans of him (can you guess my feelings?). That would take more magic than even data could bring.

But they realised that they could target quite precisely where the sort of people who do already tend to like him live, and hence harangue them to get out and vote. And whether that is the reason that this malevolent joker is still in the running or not I wouldn’t like to say – but it looks like it didn’t hurt.

So, righteous Remainers, let’s do likewise. Let’s look for some populations that are already very favourable to remaining in the EU, and see whether they’re likely to turn out unaided.

Want to remain

Well, unfortunately all of the top “in favour of remaining” groups seem to be ranked lower in terms of turnout than in terms of pro-remain feeling, but one variable sticks out like a sore thumb: age. It appears that people at the lower end of the age groups, here 18-39, are both some of the most likely subsections of people to be pro-Remain, and some of the least likely to say they’ll go and vote. So, citizens, it is your duty to go out and accost some youngsters; drag ’em to the polling booth if necessary. It’s also of interest to note that if leaving the EU is a “bad thing”, then, long term, it’s the younger members of society who are likely to suffer the most (assuming it’s not overturned any time soon).
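For anyone who would rather hunt for these gaps in code than by eyeballing a slope chart, here is a minimal sketch of the comparison. The group names and ranks are placeholders I have made up for illustration, not the published YouGov figures, and `mobilisation_targets` is a hypothetical helper name.

```python
# Each row: (group, remain_rank, turnout_rank).
# Rank 1 = most pro-Remain / most likely to say they will vote.
# These ranks are invented placeholders, NOT the published figures.
rows = [
    ("Group A", 1, 6),
    ("Group B", 2, 3),
    ("Group C", 5, 1),
    ("Group D", 6, 5),
]


def mobilisation_targets(rows, top_n=2):
    """Among the top_n most pro-Remain groups, return those whose
    turnout rank is worse than their Remain rank, biggest gap first."""
    candidates = [
        (name, turnout_rank - remain_rank)
        for name, remain_rank, turnout_rank in rows
        if remain_rank <= top_n and turnout_rank > remain_rank
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


targets = mobilisation_targets(rows)
```

The larger the gap, the more a group’s enthusiasm for remaining outstrips its stated intention to vote, which is roughly what the age lines on the chart are showing.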

But who do we need to nobble, er, educate? Let’s look at the subsections of population that are most eager to leave the EU:

Want to leave

OK, some of the pro-leavers also rank quite low in terms of turnout, all good. But a couple of lines rather stand out.

One is age based again; here the opposite end of the spectrum, 60+ year-olds, are some of the least likely to want to remain in Europe and some of the most likely to say they’ll go and vote (historically, the latter has indeed been true). And, well, UKIP people don’t like Europe pretty much by definition – but they seem worryingly likely to claim they’re going to turn up and vote. Time to go on a quick affiliation conversion mission – or at least plan a big purple-and-yellow distraction of some kind…?


There’s at least one obvious critical measure missing from this analysis, and that is the respective sizes of the subpopulations. The number of UKIP supporters for instance is very likely, even now, to be smaller than the number of 60+ year-olds, thankfully – a fact that you’d have to take into account when deciding how to have the biggest impact.

Whilst the YouGov data published did not include these volumes, they did build a fun interactive “referendum simulator” that, presumably taking this into account, lets you simulate the likely results based on your view of the likely turnout, age & class skew based on their latest polling numbers.
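To see why size matters, here is a rough sketch of how group size, turnout and voting intention might combine into an overall result. Everything in it is an assumption for illustration: the sizes, turnout probabilities and Remain shares are made up (the published data only gives a turnout rank, not a probability), and it assumes the groups do not overlap, which is not true of segments like age and party affiliation.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    size: int            # hypothetical number of eligible voters
    turnout: float       # hypothetical probability of actually voting (0-1)
    remain_share: float  # share of the group's voters backing Remain (0-1)


def remain_share_of_votes(groups):
    """Return Remain's share of all votes cast, assuming non-overlapping groups."""
    votes_cast = sum(g.size * g.turnout for g in groups)
    remain_votes = sum(g.size * g.turnout * g.remain_share for g in groups)
    return remain_votes / votes_cast


# Made-up numbers purely for illustration; not YouGov figures.
groups = [
    Group("younger", size=1000, turnout=0.5, remain_share=0.7),
    Group("older", size=1000, turnout=0.8, remain_share=0.4),
]
baseline = remain_share_of_votes(groups)

# Same opinions, but the younger group now turns out as reliably as the older one.
boosted = remain_share_of_votes([
    Group("younger", size=1000, turnout=0.8, remain_share=0.7),
    groups[1],
])
```

With these invented inputs nobody changes their mind, yet Remain’s share of the votes cast rises simply because more of the favourable group shows up. A much smaller group would move the total far less for the same change in turnout.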

Accessing Adobe Analytics data with Alteryx

Adobe Analytics (also known as Site Catalyst, Omniture, and various other names both past and present) is a service that tracks and reports on how people use websites and apps. It’s one of the leading solutions for organisations who are interested in studying how people are actually using their digital offerings.

Studying real-world usage is often far more insightful, in my view, than surveying people before or after the fact. Competitors to Adobe Analytics would include Google Analytics and other such services that allow you to follow web traffic, and answer questions from those as simple as “how many people visited my website today?” up to “can we predict how many people from New York will sign up to my service after having clicked button x, watched my promo video and spent at least 10 minutes reading the terms and conditions?”

In their own words:

What is Adobe Analytics?
It’s the industry-leading solution for applying real-time analytics and detailed segmentation across all of your marketing channels. Use it to discover high-value audiences and power customer intelligence for your business.

I use it a lot, but until recently have always found that it suffers from a key problem. Please pardon my usage of the 4-letter “s word” but, here, at least, the Adobe digital data has always pretty much remained in a silo. Grrr!

There are various native solutions, some of which are helpful for certain use cases (take a look at the useful Excel add-in or the – badly named in my opinion, and somewhat temperamental – “data warehouse” functionality for instance). We have also had various technology teams working on using native functionality to move data from Adobe into a more typical and accessible relational database, but that seems to be a time-consuming and resource-intensive operation to get in place.

So none of the above solutions has yet really met my need to extract reasonably large volumes of data quickly and easily on an ad hoc basis for integration with other datasources in a refreshable manner. And without that, in this world that ever-increasingly moves towards digital interactions, it’s hard to get a true overall view of your customers’ engagement.

So, imagine how the sun shone and the angels sang in my world when I saw the Alteryx version 10.5 release announcement.

…Alteryx Analytics 10.5 introduces new connectors to Google Sheets, Adobe Analytics, and Salesforce – enhancing the scope of data available for analytic insights

I must admit that I had had high hopes that this would happen, insomuch as when looking at the detailed agenda for this year’s Alteryx Inspire conference (see you there?) I noticed that there was mention of Adobe Analytics within a session called “How to process and visualise data in the cloud”. But yesterday it actually arrived!

It must be said that the setup is not 100% trivial, so below I have outlined the process I went through to get a successful connection, in case it proves useful for others to know.

Firstly, the Adobe Analytics data connector is not actually automatically installed, even when you install the full, latest version of Alteryx. Don’t let this concern you. The trick, after you have updated Alteryx to at least version 10.5, is to go and download the connector separately from the relevant page of the Alteryx Analytics gallery. It’s the blue “Adobe Analytics install” file you want to save to your computer; there’s no need to press the big “Run” button on the website itself.

(If you don’t already have one, you may have to create an Alteryx gallery user account first, but that’s easy to do and free of charge, even if you’re not an Alteryx customer. And whilst you’re there, why not browse through the manifold other goodies it hosts?).

You should end up with a small file called “AdobeAnalytics.yxi” on your computer. Double click that, Alteryx will load up, and you’ll go through a quick and simple install routine.


Once you’ve gone through that, check on your standard Alteryx “Connectors” ribbon and you should see a new tool called “Adobe Analytics”.

Just like any other Alteryx tool you can drag and drop that into your workflow and configure it in the Configuration pane. Once configured correctly, you can use it in a similar vein to the “Input data” tool.

The first thing you’ll need to configure is your sign-in method, so that Alteryx becomes authorised to access your Adobe Analytics account.

This isn’t necessarily as straightforward as with most other data connectors, because Adobe offers a plethora of different types of account or means of access, and it’s quite possible the one that you use is not directly supported. That was the case for me at least.

Alteryx have provided some instructions as to how to sort that out here. Rather than use my standard company login, I instead created a new Adobe ID (using my individual corporate email address), logged in with it, and used the “Get access” section of the Adobe site to link my company Adobe Analytics login to my new Adobe ID.

That was much simpler than it sounds, and you may not need to do it if you already have a proper Adobe ID or a Developer login, but that’s the method I successfully used.

Then you can log in, via the tool’s configuration panel.


Once you’re happily logged in (using the “User login” option if you followed the same procedure as I did above), you get to the juicy configuration options to specify what data you want your connector to return from the Adobe Analytics offerings.

Now a lot of the content of what you’ll see here is very dependent on your Adobe setup, so you might want to work with the owner of your Adobe install if it’s not offering what you want, unless you’re also multitasking as the Adobe admin.

In essence, you’re selecting a Report Suite, the metrics (and dimensions, aka “elements”) you’re interested in, the date range of significance and the granularity. If you’re at all familiar with the web Adobe Analytics interface, it’s all the same stuff with the same terminology (but, if it offers what you want, so much faster and more flexible).

Leave “Attempt to Parse Report” ticked, unless for some reason you prefer the raw JSON the Adobe API returns instead of a nice Alteryx table.
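If you are curious what that parsing step saves you from, here is a minimal sketch of flattening a nested JSON report into table-like rows. The JSON shape is invented purely for illustration and is not the actual format returned by the Adobe API, and `flatten` is a hypothetical helper, not part of Alteryx.

```python
import json

# Invented report shape for illustration only; NOT the real Adobe response.
raw = """
{
  "metrics": ["visitors", "pageviews"],
  "rows": [
    {"date": "2016-05-01", "counts": ["120", "340"]},
    {"date": "2016-05-02", "counts": ["95", "280"]}
  ]
}
"""


def flatten(report_json):
    """Turn the nested report into a list of flat dicts, one per row."""
    report = json.loads(report_json)
    metric_names = report["metrics"]
    table = []
    for row in report["rows"]:
        record = {"date": row["date"]}
        # Pair each metric name with the value in the same position.
        record.update(zip(metric_names, row["counts"]))
        table.append(record)
    return table


table = flatten(raw)
```

Each entry in `table` is then one flat record with a date and one field per metric, which is the sort of shape a downstream tool can work with directly.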

Once you’ve done that, then Alteryx will consider it as just another type of datasource. The output of that tool can then be fed into any other Alteryx tool – perhaps start with a Browse tool to see exactly what’s being returned from your request. And then you’re free to leverage the extensive Alteryx toolkit to process, combine, integrate, analyse and model your data from Adobe and elsewhere to gain extra insights into your digital world.

Want an update with new data next week? Just re-open your workflow and hit run, and see the latest data flow in. That’s a substantial time and sanity saving improvement on the old-style battle-via-Excel to de-silo this data, and perhaps even one worth buying Alteryx for alone if you do a lot of this!

Don’t forget that with the Alteryx output data tool, and the various enhanced output options including the in-database tools and Tableau output options from the latest version, you could also use Alteryx simply to move data from Adobe Analytics to some other system, whether for visualisation in Tableau or integration into a data warehouse or similar.

A use case might simply be to automatically push up web traffic data to a datasource hosted in Tableau Server for instance, so that any number of your licensed analysts can use it in their own work. You can probably find a way to do a simple version of this “for free” using the native Adobe capabilities if you try hard enough, but anything that involves a semblance of transform or join, at least in our setup, seems far easier to do with external tools like Alteryx.

Pro-tip: frustrated that this tool, like most of the native ones, restricts you to pulling data from one Adobe Report Suite at a time? Not a problem – just copy and paste the workflow once for each report suite and use an Alteryx Union tool to combine the results into one long table.

Here are screenshots of an example workflow and results (not from any real website…) to show that in action. Let’s answer a simple question: how many unique visitors have we had to 2 different websites, each represented by a different report suite, over the past week?
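For readers who think in code rather than workflow canvases, the union pattern amounts to something like the sketch below. The suite IDs, field names and visitor counts are all hypothetical, and `union_report_suites` is just an illustrative name for what the Union tool does after each per-suite pull.

```python
def union_report_suites(results_by_suite):
    """Stack per-suite rows into one long table, tagging each row with its suite."""
    combined = []
    for suite_id, rows in results_by_suite.items():
        for row in rows:
            combined.append({"report_suite": suite_id, **row})
    return combined


# Hypothetical suite IDs and made-up weekly figures, one pull per report suite.
results = {
    "site_one": [{"week_starting": "2016-05-02", "unique_visitors": 120}],
    "site_two": [{"week_starting": "2016-05-02", "unique_visitors": 75}],
}

long_table = union_report_suites(results)
```

Tagging each row with its report suite before stacking is what keeps the two websites distinguishable once they share one table.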



Performance: in my experience, although Adobe Analytics can contain a wealth of insightful information, I’ve found the speed of accessing it to be “non-optimal” at times. The data warehouse functionality for instance promises/threatens that:

Because of the complexity of Data Warehouse reports, they are not immediately available, can take up to 72 hours to generate, and are accessible via email, FTP or API delivery mechanisms.

The data warehouse functionality surely allows complexity that’s an order of magnitude beyond what a simple workflow like this does, but just for reference, this workflow ran in about 20 seconds. Pulling equivalent data for 2 years took about 40 seconds. Not as fast as you’d expect a standard database to perform, but still far quicker than making a cup of tea.

Sidenote: the data returned from this connector appears to come in string format, even when it’s a column of a purely numeric measure. You might want to use a Select tool or other method in order to convert it to a more processable type if you’re using it in downstream tools.
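As a rough illustration of that conversion step outside Alteryx, the sketch below turns numeric-looking strings into numbers and leaves everything else alone. The column names and values are made up, and `to_number` and `convert_columns` are hypothetical helpers, not anything the connector provides.

```python
def to_number(value):
    """Convert a numeric string to int or float; leave anything else unchanged."""
    if not isinstance(value, str):
        return value
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def convert_columns(rows, columns):
    """Apply to_number to the named columns of each row, returning new rows."""
    return [
        {key: to_number(val) if key in columns else val for key, val in row.items()}
        for row in rows
    ]


# Made-up rows mimicking measures that arrive as strings.
rows = [
    {"date": "2016-05-01", "unique_visitors": "120", "bounce_rate": "0.35"},
    {"date": "2016-05-02", "unique_visitors": "n/a", "bounce_rate": "0.4"},
]

typed_rows = convert_columns(rows, columns={"unique_visitors", "bounce_rate"})
```

Values that cannot be parsed, such as the "n/a" above, are passed through untouched, so you would still want to decide how to treat them before summing or averaging.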

Overall conclusion: HOORAY!