Do Millennials have no friends?

I recently read an article claiming that 22% of Millennials say they have no friends. And then many other articles with the same figure. This made me feel sad. Some of the articles further distinguished between “close” and “best” friends, so here we’re presumably talking about just any friend of any level at all.

Sure, being a human I have felt peaks and troughs of loneliness over my life-history to date, but I’m not sure I’d cope well if I felt like I had no-one in the ‘friend’ category at all. The thought that nearly a quarter of the young-but-definitely-adult generation feel that way today was quite shocking and depressing to me.

Outside of my personal feelings, it is an increasingly well-established fact that loneliness – which I have to imagine strongly associates with having no friends – is not only extremely unpleasant for most folk, but actually harmful to one's physical and mental health. People with stronger social connections may literally live longer. One can go too far with the ‘you may as well take up smoking’ type headlines, but there does seem to be something potentially life-and-death within this subject.

But before getting too upset for the local young adults, I did want to check in on the data itself. Millennials do get a bad rap. Most famously perhaps, we’re all supposed to believe that the reason young people don’t own houses is nothing to do with the fact houses cost an insane amount of money, and everything to do with their high expenditure on avocado toast. Somehow a stereotype seems to have developed in some quarters that the reason not every millennial has a job is because they’re lazy (nothing to do with the supply side of the job market, naturally), they’re selfish, narcissistic and constantly going around maliciously killing various industries and destroying other venerable and much-loved institutions, including DUIs, divorce and porn. After all of that, headlines that involve the word “Millennials” do tend to induce a slight level of skepticism in me.

Reassuringly, it turns out the friendship data was from a survey conducted by a reputable enough survey company, YouGov. And they were good enough to publish the full, albeit heavily aggregated, results of the survey itself. OK, in a horrible PDF format, but it didn’t take too long to extract the details of the folk who responded ‘zero’ to the “How many friends do you have?” question in a way conducive to constructing a few breakdowns of these folks below, and satisfying a bit of personal curiosity.
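For the curious, the extraction step itself wasn’t anything clever. As a rough sketch of the sort of thing involved – using the pdfplumber library, with the file name and page position being placeholders rather than the real document’s layout – something like this gets you most of the way to a table you can break down:

```python
# Rough sketch of pulling a survey crosstab out of a PDF with pdfplumber.
# "yougov_friends.pdf" and the page index are hypothetical - the real file
# needs a bit more wrangling to line the headers up properly.
import pdfplumber
import pandas as pd

with pdfplumber.open("yougov_friends.pdf") as pdf:
    page = pdf.pages[3]              # page holding the "how many friends?" table (assumed)
    rows = page.extract_table()      # list of row lists, first row being the header

df = pd.DataFrame(rows[1:], columns=rows[0])

# Keep just the respondents who answered "0", ready for the breakdowns below
zero_friends = df[df.iloc[:, 0].str.strip() == "0"]
print(zero_friends)
```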


TLDR: Yes, 22% of Millennials did say they had no friends, the highest of all surveyed generations. But it’s not clear to me at all that it’s because they’re Millennials. For example, 27% of black people said the same. And what does ‘friend’ even mean in this survey?


Have some reading time on your hands? Well, in accordance with the guidance given in the original data file, any groups where the number of participants surveyed was less than 50 will not be shown, because these very small samples are considered by YouGov to be statistically unreliable. I will however note what the ‘missing’ categories are, in case it helps clarify who is or isn’t in each category.

Unfortunately there didn’t seem to be a whole lot of other statistical significance info in the data file; no confidence intervals or the like. So it’s not clear to me to what extent small percentage differences should be considered “real”. But they are a reputable enough company who have at least taken the time to re-weight the respondents to represent a base of all US adults and talk about the limitations of too-small samples, so I’m going to go wild and assume that we might care about at least the larger differences.
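As a purely illustrative aside, if we did know the unweighted number of respondents behind any given group, attaching a rough confidence interval to a reported percentage is simple enough to do ourselves. A quick sketch, with an entirely made-up group size of 300:

```python
# Illustrative only: a 95% confidence interval around a reported 22%,
# using a made-up group size of 300 respondents.
from statsmodels.stats.proportion import proportion_confint

n = 300                      # hypothetical number of respondents in the group
count = round(0.22 * n)      # 22% reporting zero friends

low, high = proportion_confint(count, n, alpha=0.05, method="wilson")
print(f"22% with n={n}: 95% CI roughly {low:.1%} to {high:.1%}")
```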

By generation

Missing categories:
– Gen Z (people born in the year 2000 and later)
– Pre-Silent generation (1927 and earlier)

So this is the data the articles focussed on. Sure enough, Millennials were more likely to report having no friends than the other groups. Are we seeing a uniquely lonely generation? Well, it’s possible. However, to be honest, it’s not possible to tell if Millennials are “special” here from this data.

There are other potential explanations, including that – by definition – each generation here must have been a different age when they were surveyed.

The survey was carried out in 2019, and YouGov here defines a Millennial as being someone born between 1982 and 1999 (the exact definition varies depending on who you ask – so always best to check the data source!). So these folk were between 20 and 37 years old. Compare that to the seemingly more friend-enabled ‘Silent Generation’, who in this analysis would have been between 74 and 91 years old.

Perhaps – and I’m not presenting any evidence here to suggest you should believe this over any other hypothesis – it’s just normal that older people are less likely to report having no friends than younger people.

Are changes in the number of people reporting having no friends really a ‘cohort effect’, which is what a lot of the headlines about this survey imply? More data-digging would be needed to determine that, as opposed to this being, for instance, an aging effect.

An aging effect is a change in variable values which occurs among all cohorts independently of time period, as each cohort grows older.

A cohort effect is a change which characterizes populations born at a particular point in time, but which is independent of the process of aging.

A period effect is a change which occurs at a particular time, affecting all age groups and cohorts uniformly.

Source: Distinguishing aging, period and cohort effects in longitudinal studies of elderly populations

By gender

Not a whole lot to say here. Males seem slightly more likely to report having no friends than females, but the gender differences are much less than between generations. Without knowing the confidence intervals of the responses it’s also hard to know how significant these differences are.

By region

Again, only relatively small differences are seen when the respondents are split up into what region of the US they live in.

By race

OK, here are some large differences again!

The difference between Black and White respondents – 16 percentage points – is actually the same level of difference as between Millennials and the generation with the very lowest % of people reporting having no friends.

It’s interesting that many of the articles reporting on this survey focused on the generation as opposed to the race. There may be a legitimate reason why, but it’s not self-evident to me. It seems if we’re worried that Millennials may be lonely, the same concern might be needed for non-white folk too.

By education level

The big differences keep on coming! At a glance, this looks like a strong positive link between having higher levels of education and having a friend.

By income

Can money buy you friends? Traditionally we tend to say no. But having a higher income sure does seem to reduce the likelihood of you feeling like you have no friends at all.

(I am not sure exactly what income was asked for – from the values, I’d assume this is something like annual household income, but should verify before stating that to be the case!)

By urbanity of area lived in

The traditionalist’s view that city living includes being surrounded by hordes of other people, but feeling personally lonely, seems directionally borne out in these results. Univariately at least, urban dwellers are more likely to report having no friends.

By marital status

Missing categories
– Civil partnership
– In a relationship, not living together
– Separated
– Other
– Prefer not to say

So is getting married the end of all friendships, with the happy couple dumping their pals so as to get on with journeying their way through mortgages, careers and other misc adulting? Seemingly not. People who are, or once were, married were a lot less likely to report having no friends than those who were not.

By whether being a parent or guardian of any children

Missing categories:
– Don’t know / Prefer not to say

Likewise, parenting doesn’t appear to remove your entire friendship circle (or at least if it does, maybe you end up replacing them all with new parenty-friends over the years). Having kids, especially ones who are now adults, seems to make you less likely to report having no friends.

So, to summarise:

Are Millennials more likely than other well-represented generations to report having no friends in this survey? Yes, they are. But we need more data to understand if this is a “Millennial generation” phenomenon vs a “being in your 20-30s” phenomenon. After all, a 40-year-old has had longer to find a friend out there in the wilderness!

Millennials aren’t the only group to report having no friends

Applying the same analysis to the rest of the survey results shows the existence of several other ‘risk factors’ for reporting no friends.

Excluding the variables that show only a couple of percentage point differences between categories, these added-risk groups include:

  1. not being white
  2. having a low level of education
  3. having a low income
  4. living in an urban area
  5. not being or having been married
  6. not having children, especially of adult age

So there’s potential for confounding here. Let’s imagine a world where being born in the 1980-90s did not actually affect your friend count. If any of the above factors are over-represented in Millennials in comparison to other groups, we could still see the same overall effect.
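To make that concrete, here’s a toy simulation – every number invented – in which friendlessness depends only on income, yet Millennials still come out looking worse in the simple by-generation breakdown, purely because more of them sit in the lower income band:

```python
# Toy demonstration of confounding: friendlessness depends ONLY on income here,
# but Millennials show a higher "no friends" rate because a larger share of
# them are low-income. All probabilities are invented for illustration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000

generation = rng.choice(["Millennial", "Boomer"], size=n)
p_low_income = np.where(generation == "Millennial", 0.5, 0.3)  # assumed income mix
low_income = rng.random(n) < p_low_income

p_no_friends = np.where(low_income, 0.30, 0.10)  # no direct effect of generation
no_friends = rng.random(n) < p_no_friends

df = pd.DataFrame({"generation": generation, "no_friends": no_friends})
print(df.groupby("generation")["no_friends"].mean())
# Millennials come out a few points higher despite generation doing nothing directly.
```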

I don’t intend to dig up all the stats correlating age with the 6 bullet points above for this post, but even a modicum of websearching reveals sensible-sounding sources with claims like:

Relative to members of earlier generations, millennials are more racially diverse, more educated, and more likely to have deferred marriage; these comparisons are continuations of longer-run trends in the population. Millennials are less well off than members of earlier generations when they were young, with lower earnings, fewer assets, and less wealth.

Source: Are Millennials Different?

So that’s risk factors 1, 3 and 5 confounding away, mitigated perhaps by reverse-risk factor 2.

For reasons of age alone, it’s unlikely many millennials have adult children yet, and they don’t seem to be in a particular hurry to have any children at all. All this, whilst enjoying urban life, if they’re able to.

So, how to differentiate the root cause? Well, with the level of data published – and don’t get me wrong YouGov, I’m grateful any was! – it’s not really possible to. A more complex analysis using data at the individual person level, allowing us to look at the effect of generation controlling for other variables, and ideally comparing also with previous time series, would be the obvious start. Whilst that type of observational study is usually not able to prove causation beyond doubt, we might get closer towards understanding the likely fundamentals.
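For what it’s worth, the sort of ‘controlling for other variables’ analysis I have in mind would, given individual-level responses, look something like the sketch below. The file and column names are hypothetical – this is the shape of the analysis, not something runnable against YouGov’s published aggregates:

```python
# Sketch of an individual-level analysis: logistic regression of "reports no
# friends" (0/1) on generation, controlling for the other candidate factors.
# "responses.csv" and its column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("responses.csv")   # one row per respondent

model = smf.logit(
    "no_friends ~ C(generation) + C(race) + C(education) + C(income_band)"
    " + C(urbanity) + C(marital_status) + has_children",
    data=df,
).fit()

# If the Millennial coefficient stays large after the controls, the 'cohort'
# story looks more plausible; if it shrinks towards zero, confounding was
# doing much of the work.
print(model.summary())
```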

It didn’t escape my notice – although of course I am not going to prove this to be true here – that many of the higher risk groups are those that society has often appeared to value less highly – poor people, non-white people, less-educated people, the unmarried; think of the sections of society sometimes defined by or over-represented in low “socioeconomic status” groups. Perhaps policies designed to assist those our current social structure apparently does not help so much may have a bonus side effect in the realm of strengthening social connections – and all the health and life benefits that go alongside that.

What is a friend anyway?

After discussing this statistic with a friend (see what I did there? True story though, and thank you, correspondent) I was reminded that the definition of a friend is itself rather woolly. One person’s friend is another person’s acquaintance, colleague or window-cleaner.

“Friendship is difficult to describe,” said Alexander Nehamas, a professor of philosophy at Princeton, who in his latest book, “On Friendship,” spends almost 300 pages trying to do just that. 

Source: Do Your Friends Actually Like You?

So another scenario in which millennials could be more likely than others to report having no friends – even if in reality they had the same level of social connections – would be if they define ‘friend’ differently, especially more stringently, than others. Some more qual-side digging into whether disparate generations define friends differently to each other would be useful to look into that hypothesis.

Related to definitions, there could also be something in how the question was asked. I did not see the original survey, but the YouGov write-up implies the question asked was

“Excluding your partner and any family members, how many of each of the following do you have?”

followed by a list comprising “acquaintances”, “friends”, “close friends” and “best friends”.

Most of the articles focus only on the “friends” result, where the 22% zero figure is seen. OK, fine – taken in isolation, “close friends” and “best friends” sound like subsets of friends, right?

But if presented with each of those options on the same screen, perhaps respondents might categorise each person they know into an exclusive category. So if you pop your pal Jimmy into the “best friends” box, perhaps you don’t also add him into the basic “friends” box.

Perhaps you feel close to all your friends, so have 10 close friends but no “non-close” friends, except those you categorise as acquaintances. In this way, a very close-friend-fulfilled person might be included in the no friends bucket when analysed one question at a time.

If this was the case, then whilst the articles aren’t reporting anything untrue, it may be misleading when taken in isolation. All this is only a theory of course, as I haven’t seen the precise flow of the original survey. Perhaps the questions were asked in a way less likely to cause this issue. But we can tell that the 4 categories aren’t being used entirely as subsets of each other, as 25% of Millennials report having no acquaintances, vs only 22% having no friends; i.e. more Millennials have friends than acquaintances.

(Out of interest, 27% of Millennials reported having no close friends, and 30% no best friends – and yes, pedants, some people did report having more than one ‘best’ friend).

Anyhow, wild hypothesising aside: More knowledge does give us a higher chance of developing effective remedies, if remedies are indeed needed. Which they likely are, in my opinion, no matter what the precise count of Millennials involved is or why, given the dramatic impact of loneliness on people’s lives. This is an important line of research that should likely be pursued with the full resources and rigour that serious issues around health and well-being deserve.

But in the meantime, associations between loneliness and all sorts of negative health and well-being effects have been repeatedly demonstrated. So if we’ve societal levers to pull, or personal practices to enact that have the potential to reduce any level of friendlessness, let’s get on and do it.


Future features coming to Tableau 10.2 and beyond – that they didn’t blog about

Having slowly de-jetlagged from this year’s (fantastic and huge) Tableau conference, I’d settled down to write up my notes regarding the always-thrilling “what new features are on the cards?” sessions, only to note that Tableau have already done a pretty good job of summarising it on their own blog here, here and here.

There’s little point in my replicating that list verbatim, but I did notice a few things I’d noted down from the keynote announcements that weren’t immediately obvious in Tableau’s blog posts. I have listed some of those below for reference. Most are just fine details, but one or two seem more major to me.

Per the conference, I’ll divide this up into “probably coming soon” vs “1-3 year vision”.

Coming soon:

Select from tooltip – a feature that will no doubt seem like it’s always been there as soon as we get it.

We can already customise tooltips to show pertinent information about a data point that doesn’t influence the viz itself. For example, if we’re scatter-plot analysing sales and profit per customer, perhaps we’d like the tooltip to show whether the customer is a recent customer vs a long-term customer when hovered over.

In today’s world, as you hover over a particular customer’s datapoint, the tooltip indeed may tell you that it’s a recent customer. But what’s the pattern in the other datapoints that are also recent customers?

In tomorrow’s world you’ll be able to click where it tells you “recent customer” and all the other “recent customers” in the viz will be highlighted. You can get the same end result today with the highlighter tool, but this will likely be far more convenient in certain situations.

A couple of new web-authoring features, to add to the list on the official blog post.

  1. You can create storypoints on the web
  2. You’ll be able to enable full-screen mode on the web

Legends per measure: this might not sound all that revolutionary, but when you think it through, it enables this sort of classic viz: a highlighted table on multiple measures – where each measure is highlighted independently of the others.

[Image: highlight table where each measure has its own independent colour legend]
Having average sales of £10,000 no longer has to mean that the high customer age of 100 in the same table is highlighted as though it were tiny.

Yes, there are workarounds to make something that looks similar to the above today – but it’s one of those features where I’ve found that people yet to be convinced of Tableau’s merits react negatively when it turns out it’s not a simple operation, having compared it to other tools (Excel…). Whilst recreating what you made in another tool is often exactly the wrong approach to using a new tool, this type of display is one of the few I see a good case for making easy enough to create.

In the 1-3 year future:

Tableau’s blog does talk about the new super-fast data engine, Hyper, but doesn’t dwell on one cool feature that was demoed on stage.

Creating a Tableau extract is sometimes a slow process. Yes, Hyper should make it faster, but at the end of the day there are factors like remote database performance and network speed that might mean there’s simply no practical way to speed it up.  Today you’re forced to sit and stare at the extract creation process until it’s done.

Hyper, though, can do its extract-making process in the background, and let you use it piece-by-piece, as it becomes available.

So if you’re making an extract of sales from the last 10 years, but so far only the information from the last 5 years has arrived to the extract creation engine, you can already start visualising what happened in the last 5 years. Of course you’ll not be able to see years 6-10 at the moment, as it’s still winging its way to you through the wifi. But you can rest safe in the knowledge that once the rest of the data has arrived it’ll automatically update your charts to show the full 10 year range. No more excuses for long lunches, sorry!

It seems to me that this, and features like incremental refresh, also open the door to enabling near real-time analysis within an extract.

Geographic augmentation – Tableau can plot raw latitude and longitude points with ease. But in practice, they are just x-y points shown over a background display; there’s no analytical awareness that point x,y falls within the state of Texas whereas point y,z falls within New York. But there will be. Apparently we will be able to roll up long/lat pairs to geographic components like zip, state, and so on, even when the respective dimension doesn’t appear in the data.
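Outside of Tableau, the underlying operation is essentially a point-in-polygon spatial join. Purely for illustration, a rough sketch of the same idea in geopandas – the boundary file and its column names are placeholders:

```python
# Rough sketch of rolling raw lat/long points up to states via a spatial join.
# "states.shp" is a placeholder for any states boundary layer you have to hand.
import geopandas as gpd
import pandas as pd

points = pd.DataFrame({"lat": [31.0, 40.7], "lon": [-100.0, -74.0]})
points_gdf = gpd.GeoDataFrame(
    points,
    geometry=gpd.points_from_xy(points.lon, points.lat),
    crs="EPSG:4326",
)

states = gpd.read_file("states.shp").to_crs("EPSG:4326")

# Each point picks up the attributes (state name etc.) of the polygon it falls inside
joined = gpd.sjoin(points_gdf, states, how="left", predicate="within")
print(joined.head())   # the state name column depends on the boundary file used
```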

Web authoring – the end goal is apparently that you’ll be able to do pretty much everything you can do publishing-wise in Tableau Desktop on the web. In recent times, each iteration has added more and more features – but in the longer term, the aim is to get to absolute parity.

We were reassured that this doesn’t mean that the desktop product is going away; it’s simply a different avenue of usage, and the two technologies will auto-sync so that you could start authoring on your desktop app, and then log into a website from a different computer and your work will be there waiting for you, without the need to formally  publish it.

It will be interesting to see whether, and how, this affects licensing and pricing as today there is a large price differential between for instance a Tableau Online account and Tableau Desktop Professional, at least in year one.

And finally, some collaboration features on Tableau server.

The big one, for me, is discussions (aka comments).  Right alongside any viz when published will be a discussion pane. The intention is that people will be able to comment, ask questions, explain what’s shown and so on.

But, doesn’t Tableau Server already have this? Well, yes, it does have comments, but in my experience they have not been greatly useful to many people.

The most problematic issue in my view has been the lack of notifications. That is to say, a few months after publishing a delightful dashboard, a user might have a question about what they’re seeing and correctly pop a comment on the page displaying the viz. Great.

But the dashboard author, or whichever SME might actually be able to answer the question, isn’t notified in any way.  If they happen to see that someone commented by chance, then great, they can reply (note that the questioner will not be notified that someone left them an answer though). But, unless we mandate everyone in the organisation to manually check comments on every dashboard they have access to every day, that’s rather unlikely to be the case.

And just opening the dashboard up may not even be enough, as today they tend to be displayed “below the fold” for any medium-large sized dashboard. So comments go unanswered, and people get grumpy and stop commenting, or never notice that they can even comment.

The new system however will include @user functionality, which will email the user when a comment or question has been directed at them. I’m also hoping that you’ll be able to somehow subscribe to dashboards, projects or the server such that you get notified if any comments are left that you’re entitled to see, whether or not you’re mentioned in them.

As they had it on the demo at least, the comments also show on the right hand side of the dashboard rather than below it – which, given desktop users tend to have wide rather than tall screens, should make them more visible. They’ll also be present in the mobile app in future.

Furthermore, each time a comment is made, the server will store and show the state of the visualisation at that time, so that future readers can see exactly what the commenter was looking at when they made their comments. This will be great for the very many dashboards that are set up to autorefresh or allow view customisation.

[Image: the new discussion pane shown alongside a published viz]

(My future comment wishlist #1: ability to comment on an individual datapoint, and have that comment shown wherever that datapoint is seen).

Lastly, sandboxes. Right now, my personal experience has been that there’s not a huge incentive to publish work-in-progress to a Tableau server in most cases. Depending on your organisation’s security setup, anything you publish might automatically become public before you’re ready, and even if not, then unless you’re pretty careful with individual permissions it can be the case that you accidentally share your file too widely, or not widely enough, and/or end up with a complex network of individually-permissioned files that are easy to get mixed up.

Besides, if you always operate from the same computer, there’s little advantage (outside of backups) to publishing it if you’re not ready for someone else to look at it. But now, with all this clever versioning, recommendy, commenty, data-alerty stuff, it becomes much more interesting to do so.

So, there will apparently be a user sandbox; a private area on the server where each Tableau user can upload and work on their files, safe in the knowledge that what they do there is private – plus they can customise which dashboards, metrics and so on are shown when they enter their sandbox.

But, better yet, team sandboxes! So, in one click, you’ll be able to promote your dashboard-in-progress to a place where just your local analytics team can see it, for instance, and get their comments, feedback and help developing it, without having to fiddle around with setting up pseudo-projects or separate server installations for your team.

Furthermore, there was mention of a team activity newsfeed, so you’ll be able to see what your immediate team members have been up to in the team sandbox since you last took a peek. This should help keep awareness of what each team member is working on high, further enhancing the possibilities for collaboration and reducing the likelihood of duplicate work.

Finally, it’s mentioned on Tableau’s blogs, but I wanted to extend a huge cheer and many thanks for the forthcoming data driven alerting feature! Lack of this style of alerting and insufficient collaboration features were the two most common complaints I have heard about Tableau Server from people considering the purchase of something that can be decidedly non-trivial in cost. Other vendors have actually gone so far as to sell add-on products to try and add these features to Tableau Server, many of which are no doubt very good – but it’s simply impossible to integrate them into the overall Tableau install as seamlessly as Tableau themselves could do.

Now we’re in 2016, where the average Very Important And Busy Executive feels like they don’t have time to open up a dashboard to see where things stand, it’s a common and obvious feature request to want to be alerted only when there is actually something to worry about – which may then result in opening the dashboard proper to explore what’s going on. And, I have no doubt, creative analysts are going to find any number of uses to put it to outside of the obvious “let me know if my sales are poor today”.

(My future data driven alert wishlist #1: please include a trigger to the effect of “if this metric has an unusual value”, meaning one based on a statistical calculation derived from historic variance/std dev etc. rather than having to put a flat >£xxxx in as the criterion).
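The spirit of that wish is easy enough to sketch outside of Tableau: compare today’s value to its own history rather than to a fixed threshold. A minimal, made-up-numbers example:

```python
# Minimal sketch of an "unusual value" alert: flag today's figure if it sits
# more than 3 standard deviations away from its trailing history, rather than
# comparing it to a flat "> £x" criterion. Numbers are invented.
import numpy as np

history = np.array([102, 98, 110, 95, 105, 101, 99, 104, 97, 103])  # past daily sales
today = 62.0

mean, std = history.mean(), history.std(ddof=1)
z = (today - mean) / std

if abs(z) > 3:
    print(f"Alert: today's value {today} is {z:.1f} standard deviations from typical")
else:
    print("Nothing unusual today")
```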

What people claim to believe: Hillary Clinton edition

Back to political opinion polls today I’m afraid. Yep, the UK’s Brexit is all done and dusted (haha) but now our overseas friends seem to be facing what might be an even more unlikely choice in the grand US presidential election 2016.

Luckily, the pollsters are on hand to guide us through the inner minds and intentions of the voters-to-be. At last glance, it was looking pretty good for a Clinton victory – although, be not complacent ye Democrats, given the lack of success in the field of polling with regards to the aforementioned Brexit or perhaps the 2015 General Election here in the UK.

Below is perhaps my favourite most terrifying poll of recent times. It’s a recent poll carried out by the organisation “Public Policy Polling” concerning residents of the state of Florida. As usual, they asked several questions about the respondents’ characteristics and viewpoints, which lets us divide up the responses into those coming from Clinton supporters vs those coming from Trump supporters.

There are many insidious facts one could elucidate here on both sides, but given that at the moment the main polls are very much in favour of a Clinton win (but see previous comment re complacency…), let’s pick out some that might hold relevance in a world where Clinton semi-landslides to victory.

Firstly, it shouldn’t particularly matter, but one can’t help but notice that Clinton is of the female persuasion. But, hey, rational voters look at policies, competence, experience or similar attributes, so a basic demographic fact alone doesn’t matter, right?

Wrong: the survey shows that just 69% of all respondents thought that gender didn’t make a difference. And, predictably, twice as many thought that the US would be better off with a male president than thought it would be better off with a female president. The effect is notably strongest within Trump supporters, where nearly 20x the proportion of people think the US would be better with a male president than with a female one.

[Poll chart: would the US be better off with a male or a female president, split by candidate supported]

Now, I can imagine some kind of halo effect where it’s hard for people to totally differentiate “my favourite candidate is a man and I can’t imagine having a favourite candidate that is not like him” from “my favourite candidate is a man but the fact he happens to be a man is incidental”.

But the fact that nearly 40% of Trump supporters here claim that, generically, the president should be a man (implying that if it was Ms Trump vs Mr Clinton, they might vote differently) seems potentially a stronger signal of inequality than that, especially when compared to the weaker bias among Clinton supporters towards preferring a woman – which is equally illogical, but at least has a lower incidence. We can also note a pro-male bias in the “not sure” population.

Of course we don’t actually have an example of what the US is like when it has a female president, because none of the 43 serving presidents to date have been women.

But we do know part of what Hillary Clinton is already presidentially responsible for apparently. “Coincidentally” (hmm…) her husband was one of the previous 43 male presidents, and apparently the majority of Trump supporters think it’s perfectly right to hold her responsible for his “behaviour”.

Yep, anything he did, for good or bad (which, let’s face it, is probably biased towards the bad for those people who support the opposing party and/or don’t appreciate cheating spouses) is in some sense his wife’s fault, for the Trumpians.

[Poll chart: should Hillary Clinton be held responsible for Bill Clinton’s behaviour?]

But if she’s so obviously bad, then why does she actually poll quite well, at the time of writing? Well, of course there can be only one reason. The whole election is a fraud. And given we haven’t actually had the election yet, I guess the allegation must also entail that poll respondents are also lying about their intentions, and/or that all the publishers of polls are equally as corrupt as the electoral system of the US.

[Poll chart: if Clinton wins, will it be because the election was rigged?]

Yes, THREE-QUARTERS of Trump supporters polled here apparently believe that if, as seems quite likely, Clinton wins then it can only be because the election was rigged. The whole democratic process is a sham. The US has fallen prey to semi-visible forces of uber-powerful corruption. We should presumably therefore ignore the result and give Trump the golden throne (to fit inside his golden house). Choice of winner aside, this is a pretty scary indictment of the respect that citizens feel for their own democratic system. This is not to say whether they are right or wrong to feel this way; to us Brits, I think it sometimes seems that money has an even greater hold over some theoretically democratic outcomes in the US than it does over here – but that so many have so little regard for the system is surely…a concern.

But wait, it’s not just that she may hypothetically commit electoral fraud in the near future. She has apparently already committed crimes serious enough that she should already be locked up in prison.

[Poll chart: should Hillary Clinton be in prison?]

Over EIGHTY PERCENT of Trump supporters polled here think she should literally go to prison; and this isn’t predicated on her winning. Well, there’s no shortage of bad things that can be laid at her door I’m sure, she has after all been serving at a high level of politics for a while already and, without being an expert, it seems like there are many serious allegations that people lay at the Clintons’ feet. But it’s perhaps quite surprising that the large majority of her opponent’s supporters want to throw someone who is likely to be their next president in jail. I don’t think even the Blair war-crimes movement ever got quite that far!

Unless…well. I’m only sad they didn’t ask the same question about Trump. Perhaps we could be more at ease if at least the same proportion of people thought he should be locked up. An oft overlooked fact is that analysis is often meaningless without some sort of carefully-chosen comparison. Perhaps there’s a baseline figure of people that think any given prominent politician should be jailed (but I’ve not seen research on that).

It’s hard to imagine, though, that the fact Trump has himself actually appeared to threaten her with jail doesn’t play some role here with his supporters. It is apparently unprecedented for a major party nominee to have said publicly that his opponent should be jailed – but say it he did, most famously during their second presidential debate. As the Guardian reports:

Trump, embracing the spirit of the “lock her up” mob chants at his rallies, threatened: “If I win I am going to instruct my attorney general to get a special prosecutor to look into your situation – there has never been so many lies and so much deception,” he threatened.

Clinton said it was “awfully good” that someone with the temperament of Trump was not in charge of the law in the country, provoking another Trump jab: “Because you’d be in jail.”

Eric Holder, who once was the US attorney general, didn’t really seem to like that plan.

So we’ve established that, in the eyes of the average Florida Trump supporter polled here, if Clinton wins then the whole shebang was fraudulent, she should already have been locked up in prison, and, besides, the fact that she’s a woman should probably bar her from applying for the office of the president in the first place. That’s a strong indictment. But, of course, there’s another level to explore.

Is Hillary Clinton a malevolent paranormal entity, intent on destroying humankind?

[Poll chart: is Hillary Clinton an actual demon?]

Erm…2 out of every 5 Trump supporters here think yes, she definitely is an actual demon. And the majority aren’t sure that she is not an actual demon.

Even among the “not sure” supporters, only just over 50% are sure she’s not an actual demon. It’s also entertaining to contemplate the c. 10% of her own supporters who think she might be demonic yet still fancy her as president.

The lower figures might be down to some variant of the excellent Slate Star Codex’s concept of the “Lizardman’s Constant”, which can perhaps be summed up as: there is a lower-bound percentage of people who will believe, or claim to believe, any polled sentiment.

But there they benchmark that at around 4%, and ten times that proportion of Trump supporters here respond that they are certain that Clinton is a literal demon. There are many ways to introduce biases that lead to this sort of result, which Slate Star Codex does go over. But 40% is…big…if this poll is even remotely respectable.

So, where has this idea that she’s a demon come from? Have Trump supporters as a collective seen some special evidence that proves this must be true, that somehow the rest of us have overlooked? Surely each individual doesn’t randomly become subject to these thoughts, which even believers would probably term an unusual state of affairs – is there no smoke without fire? (pun intended)

Well, perhaps it has something to do with a subset of famous-enough people having stated that she is.

Trump himself did refer to her as a devil, although in fairness that just maybe possibly might be an unfortunate turn of phrase, if we want to be charitable. After all, to his credit, evidence suggests he’s not great at following a script (or at least not one you’d imagine a typical political spinner would write).

Perhaps more pertinent, for a certain subsection of viewers anyway, is presenter Alex Jones of “Infowars” fame (a website that apparently gets more monthly visitors than e.g. the Economist or Newsweek), of whom Trump says “your reputation is amazing…I will not let you down”, and who did go on a bit of a rant on this subject.

MediaMatters have kindly transcribed:

She is an abject, psychopathic, demon from Hell that as soon as she gets into power is going to try to destroy the planet. I’m sure of that, and people around her say she’s so dark now, and so evil, and so possessed that they are having nightmares, they’re freaking out… I mean this woman is dangerous, ladies and gentleman. I’m telling you, she is a demon. This is Biblical.

There’s so much more if you’re into that sort of stuff; see it all on this video, including the physical evidence he presents of Clinton’s demonness (spoiler alert: she smells bad, and Obama is obviously one too because sometimes flies land on him).

Unfortunately I’m not aware of time series data on perception of Clinton’s level of demonicness – so I’m afraid there’s no temporal analysis to present on causal factors here.

At first glance some of this might seem kind of amusing in a macabre way – especially to us foreigners for whom the local political process is hugely less pleasant or equitable than it should be, but it doesn’t usually come with claims of supernatural possession. But the outcome may not be so funny. In the likely (but not certain) event that Clinton wins, Florida at least seems to have a significant bunch of people who think the whole debacle was rigged, and Clinton should have a gender change, an exorcism and a long spell in jail before even being considered for the presidency.

Update 1: this sort of stuff probably doesn’t help matters – from former Congressmen / Radio host Joe Walsh:

Update 2: the polls are a lot closer now than they were when I started writing.

Do good and bad viz choices exist?

Browsing the wonderful timeline of Twitter one evening, I noted an interesting discussion on subjects including Tableau Public, best practice, chart choices and dataviz critique. It’s perhaps too long to go into here, but this tweet from Chris Love caught my eye.

Not being particularly adept at summarising my thoughts into 140 characters, I wanted to explore some thoughts around the subject here. Overall, I would concur with the sentiment as expressed – particularly when it had to be crammed into such a small space, and taken out of context as I have here 🙂

But, to take the first premise, whilst there are probably no viz types that are inherently terrible or universally awesome, I think one can argue that there are good or bad viz choices in many situations. It might be the case in some instances that there’s no best or worst viz choice (although I think we may find that there often is, at least out of the limited selection most people are inclined to use). Here I am imagining something akin to a data-viz version of Harris’ “moral landscape“; it may not be clear what the best chart is, but there will be local maxima that are unquestionably better for purpose than some surrounding valleys.

So, how do we decide what the best, or at least a good, viz choice is? Well, it surely comes down to intention. What is the aim of the author?

This is not necessarily self-evident, although I would suggest defaulting to something like “clearly communicating an interesting insight based on an amalgamation of datapoints” as a common one. But there are others:

  • providing a mechanism to allow end-users to explore large datasets which may or may not contain insights,
  • providing propaganda to back up an argument,
  • or selling a lot of books or artwork

to name a few.

The reason we need to understand the intention is because that should be the measure of whether the viz is good or bad.

Imagine my aim is to communicate, to an audience of ten may-as-well-be-clones business managers, that 10% of my customers are so unprofitable that we would be better off without them – note that the details of the audience are very important here too.

I’ll go away and draw 2 different visualisations of the same data (perhaps a bar chart and, hey, why not, a 3-d hexmap radial chart 🙂 ). I’ll then give version 1 to five of the managers, and version 2 to the other five. Half an hour later, I’ll quiz them on what they learned. Simplistically, I shall feel satisfied that whichever version generated the correct understanding in the most managers was the better viz in this instance.
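If you did want to put a number on that informal comparison, it boils down to comparing two proportions. A toy sketch with invented quiz results (and, with only five managers per group, little statistical power):

```python
# Toy analysis of the "which viz communicated better" experiment: compare how
# many managers in each group answered the comprehension quiz correctly.
# The counts are invented and the sample is tiny - illustration only.
from scipy.stats import fisher_exact

bar_chart_group = {"correct": 5, "incorrect": 0}
radial_chart_group = {"correct": 2, "incorrect": 3}

table = [
    [bar_chart_group["correct"], bar_chart_group["incorrect"]],
    [radial_chart_group["correct"], radial_chart_group["incorrect"]],
]
odds_ratio, p_value = fisher_exact(table)
print(f"Fisher's exact test p-value: {p_value:.3f}")
```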

Yes yes, this isn’t a perfect double-blind controlled experiment, but hopefully the point is apparent. “Proper” formal research on optimising data visualisation is certainly done, and very necessary it is too. There are far too many examples to list, but classics in the field might include the paper “Graphical Perception” by Cleveland and McGill, which helped us understand which types of charts were conducive to being visually decoded accurately by us humans and our built-in limitations.

Commercially, companies like IBM or Autodesk or Google have research departments tackling related questions. In academia, there’s groups like the University of Washington Interactive Data Lab (which, interestingly enough, started out as the Stanford Visualization Group whose work on “Polaris” was later released commercially as none other than Tableau software).

If you’re looking for ideas to contribute to on this front, Stephen Few maintains a list of some research he’d like to see done on the subject in future, and no doubt there are infinitely many more possibilities if none of those pique your curiosity.

But the point is: for certain given aims, it is often possible to use experimental procedures and the resulting data, to say, as surely as we can say many things, visualisation A is better than visualisation B at achieving its aim.

But let’s not go too far in expressing certainty here! There are several things to note, all contributing to the fact that very often there is not one best viz for a single dataset – context is key.

  • What is the aim of the viz? We covered that one already. Using a set of attractive colours may be more important than correct labelling on axes if you’re wanting to sell a poster for instance. Certain types of chart make for easier and more accurate types of particular comparisons than others. If you’re trying to learn or teach how to create a particular type of uber-creative chart in a certain tool, then you’re going to rather fail to accomplish that if you end up making a bar chart.
  • Who is the audience? For example, some charts can convey a lot of information in a small space; for instance box-and-whisker plots. An analyst or statistician will probably very happily receive these plots to understand and compare distributions and other descriptive stats in the right circumstances. I love them. However, extensive experience tells me that, no, the average person in the street does not. They are far less intuitive than bar or line charts to the non-analytically inclined/trained. However inefficient you might regard it, a table and 3 histograms might communicate the insight to them more successfully than a boxplot would (see the small sketch after this list). If they show an interest, by all means take the time to explain how to read a box plot; extol the virtues of the data-based lifestyle we all know; rejoice in being able to teach a fellow human a useful new piece of knowledge. But, in reality, your short-term job is more likely to be to communicate an important insight rather than provide an A-level statistics course – and if you don’t do well at fulfilling what you’re being employed to do, then you might not be employed to do it for all that long.
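As a trivial illustration of that trade-off, the sketch below draws exactly the same (made-up) data twice – once as a box plot, once as a histogram. Which one ‘works’ depends entirely on who is reading it:

```python
# Same data, two presentations: a box plot (compact, good for a stats-literate
# audience) versus a histogram (usually more immediately readable). Data invented.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
values = rng.normal(loc=50, scale=10, size=500)   # some made-up metric

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot(values)
ax1.set_title("Box plot")
ax2.hist(values, bins=30)
ax2.set_title("Histogram")
plt.tight_layout()
plt.show()
```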

As well as there being no single best viz type in a generic sense, there’s also no one universally worst viz type. If there was, the datarati would just ban it. Which, I guess, some people are inclined to do – but, sorry, pie charts still exist. And they’re still at least “locally-good” in some contexts – like this one (source: everywhere on the internet):

[Image: the classic joke pie chart that circulates everywhere on the internet]

But, hey, you don’t have the time to run multiple experiments on multiple audiences. Let’s imagine you also are quite new to the game, with very little personal experience. How would you know which viz type to pick? Well, this is going to be a pretty boring answer sorry – and there’s more to elaborate on later – but one way relates to the fact that, just like in any other field, there are actually “experts” in data viz. And outside of Michael Gove’s deluded rants, we should acknowledge they usually have some value.

In 1928, Bertrand Russell wrote an essay called ‘On the Value of Scepticism‘, where he laid out 3 guidelines for life in general.

 (1) that when the experts are agreed, the opposite opinion cannot be held to be certain;

(2) that when they are not agreed, no opinion can be regarded as certain by a non-expert;

and (3) that when they all hold that no sufficient grounds for a positive opinion exist, the ordinary man would do well to suspend his judgment.

So, we can bastardise these a bit to give it a dataviz context. If you’re really unsure of what viz to pick, then refer to some set of experts (to which we must acknowledge there’s subjectivity in picking…perhaps more on this in future).

If “experts” mostly think that data of type D used to convey an insight of type I to an audience of type A for purpose P is best represented in a line chart, then that’s probably the way to go if you don’t have substantial reason to believe otherwise. Russell would say that at least you can’t be held as being “certainly wrong” in your decision, even if your boss complains. Likewise, if there’s honestly no concurrence in opinion, then have a go and take your pick of the suggestions – again, no-one can claim you did something unquestionably wrong!

For example, my bias is towards feeling that, when communicating “standard” insights efficiently via charts to a literate but non-expert audience, you can’t go too far wrong in reading some of Stephen Few’s books. Harsh and austere they may seem at times, but I believe them to be based on quality research in fields such as human perception as well as experience in the field.

But that’s not to say that his well founded, well presented guidelines are always right. Just because 90% of the time you might be most successful in representing a certain type of time series as a line chart doesn’t mean that you always will be. Remember also, you may have a totally different aim or audience to those Mr Few aims his books at, in which case you cannot assume at all that the same best-practice standards would apply.

And, despite the above guidelines, because (amongst other reasons) not all possible information is ever available to us at any given time, sometimes experts are simply wrong. It turns out that the earth probably isn’t the centre of the universe, despite what you’d probably hear if you went back to experts from a millennium ago. You should just take care to find some decent reason to doubt the prevailing expertise, rather than simply ignoring it.

What we deem as the relative “goodness” of data viz techniques is also surely not static over time. For one, not all forms of data visualisation have existed since the dawn of mankind.

The aforementioned box and whisker plot is held to have been invented by John Tukey. He was only born in 1915, so if I were to travel back 200 years in time with my perfectly presented plot, then it’s unlikely I’d find many people who would find it intuitive to interpret. Hence, if my aim was to communicate insights quickly and clearly, then on the balance of probabilities this would probably be a bad attempt. It may not be the worst attempt, as the concept is still valid and hence could likely be explained to some inhabitants of the time – but in terms of bang for buck, there’d no doubt be higher peaks in the “communicating data insights quickly” landscape available to me nearby.

We should also remember that time hasn’t stopped. Contrary to Francis Fukuyama’s famous essay and book, we probably haven’t reached the end of history even politically just yet, and we most certainly haven’t done so in the world of data. Given the rate of usable data creation, it might be that we’ve only dipped our toe in so far. So, what we think is best practice today may likely not be the same a hundred years hence; some of it may not be so even next year.

Some, but not all, obstacles or opportunities surround technology. Already the world has moved very quickly from graph paper, to desktop PCs, to people carrying around super-computers that only have small screens in their pockets. The most effective, most efficient, ways to communicate data insights will differ in each case. As an example I’m very familiar with, the Tableau software application clearly acknowledged this in their last release, which includes facilities for displaying data differently depending on what device it’s being viewed on. Not that we need to throw the baby out with the bathwater, but even our hero Mr Tukey may not have had the iPhone 7 in mind when considering optimum data presentation.

Smartwatches have also appeared, albeit they are not so mainstream at the moment. How do you communicate data stories when you have literally an inch of screen to play with? Is it possible? Almost certainly so, but probably not in the same way as on a 32 inch screen; and are the personal characteristics and needs of smartwatch users the same as those of the audience who views vizzes on a larger screen anyway?

And what if Amazon (Echo), Google (Home) and others are right to think that in the future a substantial amount of our information based interactions may be done verbally, to a box that sits on the kitchen counter and doesn’t even have a screen? What does “data visualisation” mean in this context? Is it even a thing? But a lot of the questions I might want to ask my future good friend Alexa might well be questions that can only be answered by some transformation and re-presentation of data in audio form.

I already can verbally ask my phone to provide me some forms of dataviz. In the below example, it shows me a chart and a summary table. It also provides me a very brief audio summary for the occasions where I can’t view the screen, shown in the bold text above the chart. But, I can’t say I’ve heard of a huge amount of discussion about how to optimise the audio part of the “viz” for insight. Perhaps there should be.

[Image: phone screenshot showing a chart, a summary table and a brief bold-text audio summary in response to a spoken question]

Technology aside though, the field should not rest on its laurels; the line chart may or may not ever die, but experimentation and new ideas should always be welcomed. I’d argue that we may be able to prove in many cases that, today, for a given audience, for a given aim, with a given dataset, out of the various visualisations we most commonly have access to, one is demonstrably better than another, and that we can back that up via the scientific method.

But what if there’s an even better one out there we never even thought of? What if there is some form of time series that is best visualised in a pie chart? OK, it may seem pretty unlikely but, as per other fields of scientific endeavour, we shouldn’t stop people testing their hypotheses – as long as they remain ethical – or the march of progress may be severely hampered.

Plus, we might all be out of a job. If we fall into the trap of thinking the best of our knowledge today is the best of all knowledge that will ever be available, that the haphazard messy inefficiencies of creativity are a distraction from the proven-efficient execution of the task at hand, then it’ll not be too long before a lot of the typical role of a basic data analyst is swallowed up in the impending march of our robotic overlords.

Remember, a key job of a lot of data-people is really to answer important questions, not to draw charts. You do the second in order to facilitate the first, but your personal approach to insight generation is often in actuality a means to another end.

Your customer wants to know “in what month were my sales highest?”. And, lo and behold, when I open a spreadsheet in the sort of technology that many people treat as the norm these days, Google sheets, I find that I can simply type or speak in the question “What month were my sales highest?” and it tells me very clearly, for free, immediately, without employing anyone to do anything or waiting for someone to get back from their holiday.

[Image: Google Sheets answering the typed natural-language question “What month were my sales highest?”]

Yes, that feature only copes with pretty simplistic analysis at the moment, and you have to be careful how you phrase your questions – but the results are only going to get better over time, and spread into more and more products. Microsoft PowerBI already has a basic natural language feature, and Tableau is at a minimum researching into it. Just wait until this is all hooked up to the various technological “cognitive services” which are already on offer in some form or other. A reliable, auto-generated answer to “what will my sales be next week if I launch a new product category today?” may free up a few more people to spend time with their family, euphemistically or otherwise.

So in the name of progress, we can and should, per Chris’ original tweet, be open to giving and receiving constructive criticism, whether positive or negative. There is value in this, even in the unlikely event that we have already hit on the single best, universal way of representing a particular dataset for all time.

Recall John Stuart Mill’s famous essay, “On Liberty” (written in 1859, yes, even before the boxplot existed). It’s so very quotable for many parts of life, but let’s take for example a paragraph from chapter two, regarding the “liberty of thought and discussion”. Why shouldn’t we ban opinions, even when we believe we know them to be bad opinions?

But the peculiar evil of silencing the expression of an opinion is, that it is robbing the human race; posterity as well as the existing generation; those who dissent from the opinion, still more than those who hold it.

If the opinion is right, they are deprived of the opportunity of exchanging error for truth: if wrong, they lose, what is almost as great a benefit, the clearer perception and livelier impression of truth, produced by its collision with error.

Are pie charts good for a specific combination of time series data, audience and aim?

Well – assuming a particularly charitable view of human discourse –  after rational discussion we will either establish that yes, they actually are, in which case the naysayers can “exchange error for truth” to the benefit of our entire field.

Or, if the consensus view of “no way” holds strong, then, having been tested, we will have reinforced the reason why this is in both the minds of the questioner, and ourselves – hence helping us remember the good reasons why we hold our opinions, and ensuring we never lapse into the depths of pseudo-religious dogma.

Remember the exciting new features Tableau demoed at #data15 – have we got them yet?

As we get closer towards the thrills of this year’s Tableau Conference (#data16), I wanted to look back at one of the most fun parts of the last year’s conference – the “devs on stage” section. That’s the part where Tableau employees announce and demonstrate some of the new features that they’re working on. No guarantees are made as to whether they’ll ever see the light of day, let alone be in the next release –  but, in reality, the audience gets excited enough that there’d probably be a riot if none of them ever turned up.

Having made some notes of what was shown in last year’s conference (which was imaginatively entitled #data15), I decided to review the list and see how many of those features have turned up so far. After all, it’s all very well to announce fun new stuff to a crowd of 10,000 over-excited analysts…but does Tableau tend to follow through on it? Let’s check!

(Please bear in mind that these are just the features I found significant enough to scrawl down through the jet-lag; it’s not necessarily a comprehensive review of what was on show.)

Improvements in the Data category:

  • Improvements to the automatic data cleanup feature recently released that can import Excel type files that are formatted in an otherwise painful way for analysis: Yes – Tableau 9.2 brought features like “sub-table detection” to its data interpreter feature.
  • Can now understand hundreds of different date formats: Hmm…I’m not sure. I’ve not had any problems with dates, but then again I was lucky enough never to have many!
  • The Data Source screen will now allow Tableau to natively “union” data (as in SQL UNION), as well as join it, just by clicking and dragging: Yes – Tableau 9.3 allows drag and drop unioning. But only on Excel and text files. Here’s hoping they expand the scope of that to databases in the future.
  • Cross-database joins: Yes, cross-database joins are in Tableau 10.

Improvements in the Visualisation category:

  • Enhancements to the text table visualisation: Yes – Tableau 9.2 brought the ability to show totals at the top of columns, and 9.3 allowed excluding totals from colour-coding.
  • Data highlighter: Yes – Tableau 10 includes the highlighter feature.
  • New native geospatial geographies: Yes – 9.2 and 9.3 both added or updated some geographies.
  • A connector to allow connection to spatial data files: No – I don’t think I’ve seen this one anywhere.
  • Custom geographic territory creation: Yes – Tableau 10 has a couple of methods to let you do that.
  • Integration with Mapbox: Yes – Tableau 9.2 lets you use Mapbox maps.
  • Tooltips can now contain worksheets themselves: No – not seen this yet.

Improvements in the Analysis category:

  • Automatic outlier detection: No.
  • Automatic cluster detection: Yes, that’s a new Tableau 10 feature.
  • You can “use” reference lines / bands now for things beyond just static display: Hmm…I don’t recall seeing any changes in this area. No?

Improvements in the Self-Service category:

  • There will be a custom server homepage for each user: Not sure – the look and feel of the home page has changed, and the user can mark favourites etc., but I have not noticed huge changes in customisation from previous versions.
  • There will be analytics on the workbooks themselves: Yes – Tableau 9.3 brought content analytics to workbooks on server. Some metadata is shown in the content lists directly, plus you can sort by view count.
  • Searching will become better: Yes – also came with Tableau 9.3. Search shows you the most popular results first, with indicators as to usage.
  • Version control: Yes – Tableau 9.3 brought workbook revision history for server, and Tableau 10 enhanced it.
  • Improvements to security UI: Yes – not 100% sure which version, but the security UI changed. New features were also added, such as setting and locking project permissions in 9.2.
  • A web interface for managing the Tableau server: Not sure about this one, but I don’t recall seeing it anywhere. I’d venture “no”, but am open to correction!

Improvements in the Dashboarding category:

  • Improvements to web editing: Yes – most versions of Tableau since then have brought improvements here. In Tableau 10 you can create complete dashboards from scratch via the web.
  • Global formatting: Yes, this came in Tableau 10.
  • Cross-datasource filtering: Yes, this super-popular feature also came with Tableau 10.
  • Device preview: Yes, this is available in Tableau 10.
  • Device-specific dashboards: Yes, also from Tableau 10.

Improvements in the Mobile category:

  • A Tableau iPhone app: Yes – download it here. An Android app was also released recently.
  • iPad app – Vizable: Was actually launched at #data15, so yes, it’s here.

Summary

Hey, a decent result! Most of the features demonstrated last year are already in the latest official release.

And for some of those that aren’t, such as outlier detection, it feels like a framework has been put in place for the possible later integration of them. In that particular case, you can imagine it being located in the same place, and working in the same way, as the already-released clustering function.

There are perhaps a couple that it’s slightly sad to see haven’t made it just yet – I’m mainly thinking of embedded vizzes in tooltips here. From the celebratory cheers, that was pretty popular with the assembled crowds when demoed in 2015, so it’ll be interesting to see whether any mention of development on that front is noted in this year’s talks.

There are also some features released that I’d like to see grow in scope – the union feature would be the obvious one for me. I’d love to see the ability to easily union database tables beyond Excel/text sources. And now we have cross-database joins, perhaps even unioning between different technology stacks.
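For anyone hazy on the distinction being drawn here: a union stacks rows from similarly-shaped tables on top of each other, whereas a join matches rows side by side on a key. Below is a minimal sketch of the two operations in pandas, purely to illustrate the concept with made-up tables; it says nothing about how Tableau implements its own union or join features.

```python
import pandas as pd

# Two similarly-shaped tables: a union just stacks their rows
sales_2015 = pd.DataFrame({"region": ["North", "South"], "sales": [100, 120]})
sales_2016 = pd.DataFrame({"region": ["North", "South"], "sales": [110, 150]})
all_sales = pd.concat([sales_2015, sales_2016], ignore_index=True)  # like SQL UNION ALL

# A join instead matches rows on a key, adding columns rather than rows
regions = pd.DataFrame({"region": ["North", "South"], "manager": ["Ann", "Bob"]})
enriched = all_sales.merge(regions, on="region")
print(enriched)
```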

Bonus points due: in my 2015 notes, I mentioned that a feature I had heard a lot of colleague interest in, but which was not mentioned at all in the keynote, was data-driven alerting; the ability to be notified only if your KPI goes wild, for instance. Sales managers might get bored of checking their dashboards each day just to see if sales were down when 95% of the time everything is fine, so why not just send them an email when that event actually occurs?
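To make the idea concrete, here’s a minimal sketch of that sort of alert written in plain Python, entirely outside of Tableau; the threshold, mail server and addresses are invented for illustration only.

```python
import smtplib
from email.message import EmailMessage

KPI_LOWER_BOUND = 950_000  # hypothetical "sales look wild below this" threshold


def check_and_alert(daily_sales: float) -> None:
    """Email the sales manager only when the KPI is out of bounds."""
    if daily_sales >= KPI_LOWER_BOUND:
        return  # most days: everything is fine, send nothing

    msg = EmailMessage()
    msg["Subject"] = f"Sales alert: {daily_sales:,.0f} is below target"
    msg["From"] = "alerts@example.com"           # made-up addresses
    msg["To"] = "sales.manager@example.com"
    msg.set_content("Daily sales dipped below the agreed threshold.")

    with smtplib.SMTP("mail.example.com") as server:  # made-up mail server
        server.send_message(msg)


# check_and_alert(daily_sales=870_000)  # would send an email, if the server existed
```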

Well, the exciting news on that front is that some steps towards that have been announced for Tableau 10.1, which is in beta now so will surely be released quite soon.

Described as “conditional subscriptions”, the feature will allow you to “receive email updates when data is present in your viz”. That’s perhaps a slight abstraction from the most obvious form of data-driven alerting. But it’s easy to see that, with a bit of thought, analysts will be able to build vizzes that give exactly the sort of alerting functionality my colleagues, and many many others in the wider world, have been asking for. Thanks for that, developer heroes!

 

Help decide who self-driving cars should kill

Fully automated, self-driving cars are surely on their way. Given the direction of technological development, this seems a safe enough prediction to make – at least when taking the coward’s option of not specifying a time frame.

A self-driving car is, after all, a data processor, and we like to think that we’re getting better at dealing with data every day. Simplistically, in such a car sensors provide some data (e.g. “there is a pedestrian in front of the car”), some automated decision-making module comes up with an intervention (“best stop the car”), and a process is carried out to enact that decision (“put the brakes on”).
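As a toy illustration of that sense-then-decide-then-act loop (nothing like a real vehicle controller – every name and number here is invented for the sketch):

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    pedestrian_ahead: bool  # "there is a pedestrian in front of the car"
    distance_m: float


def decide(reading: SensorReading) -> str:
    """Decision-making module: map sensor data to an intervention."""
    if reading.pedestrian_ahead and reading.distance_m < 30:
        return "brake"  # "best stop the car"
    return "continue"


def act(intervention: str) -> None:
    """Enact the decision - a real car would drive the actuators here."""
    print(f"Carrying out intervention: {intervention}")


# One pass of the loop: sense -> decide -> act
act(decide(SensorReading(pedestrian_ahead=True, distance_m=12.0)))
```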

Here for example is a visualisation of what a test Google automated car “sees”.


My hope and expectation is that, when they have reached a sophisticated enough level of operation and are at a certain threshold of prevalence, road travel will become safer.

Today’s road travel is not super-safe. According to the Association for Safe International Road Travel, around 1.3 million people die in road crashes each year – and 20-50 million more are injured or disabled. It’s the single leading cause of death amongst some younger demographics.

Perhaps automated vehicles could save some of these lives, and prevent many of the serious injuries. After all, a few years ago, The Royal Society for the Prevention of Accidents claimed that 95% of road accidents involve some human error, and 76% are solely due to human factors. There is a lot at stake here. And of course there are many more positive impacts (as well as some potential negatives) one might expect from this sort of automation beyond direct life-saving, which we’ll not go into here.

At this moment in time, humanity is getting closer to developing self-driving cars; perhaps surprisingly close to anyone who does not follow the topic. Certainly we do not have any totally automated car capable of (or authorised to be) driving every road safely at the moment, and that will probably remain true for a while yet. But, piece by piece, some manufacturers are automating at least some of the traditionally human aspects of driving, and several undoubtedly have their sights on full automation one day.

Some examples:

Land Rover are shortly to be testing semi-autonomous cars that can communicate with other such cars around them.

The test fleet will be able to recognise cones and barriers using a forward facing 3D-scanning camera; brake automatically when it senses a potential collision in a traffic jam; talk to each other via radio signals and warn of upcoming hazards; and know when an ambulance, police car, or fire engine is approaching.

BMW already sells a suite of “driver assistance” features on some cars, including what they term intelligent parking, intelligent driving and intelligent vision. For people with my driving skill level (I’m not one of the statistically improbable 80% of people who think they are above average drivers), clearly the parking assistant is the most exciting: it both finds a space that your car would actually fit into, and then does the tricky parallel or perpendicular parking steering for you. Here it is in action:

Nissan are developing a “ProPilot” feature, which also aims to help you drive safely, change lanes automatically, navigate crossroads and park.

Tesla have probably the most famous “autopilot” system available right now. This includes features that will automatically keep your car in lane at a sensible speed, change lanes safely for you, alert the driver to unexpected dangers and park the car neatly for you. This is likely most of what you need for full automation for some simpler trips, although they are clear it’s a beta feature and that it is important you keep your hands on the steering wheel and remain observant when using it. Presumably preempting our inbuilt tendency towards laziness, it even goes so far as to sense when you haven’t touched the wheel for a while and tells you to concentrate; eventually coming to a stop if it can’t tell you’re still alive and engaged.

Here’s a couple of people totally disobeying the instructions, and hence nicely displaying its features.

And here’s how to auto-park a Tesla:

 

Uber seems particularly confident (when do they not?). Earlier this month, the Guardian reported that:

Uber passengers in Pittsburgh will be able to hail self-driving cars for the first time within the next few weeks as the taxi firm tests its future vision of transportation in the city. The company said on Thursday that an unspecified number of autonomous Ford Fusions will be available to pick up passengers as with normal Uber vehicles. The cars won’t exactly be driverless – they will have human drivers as backup – but they are the next step towards a fully automated fleet.


 

And of course Google have been developing a fully self-driving car for a few years now. Here’s a cheesy PR video to show their fun little pods in action.

But no matter how advanced these vehicles get, road accidents will inevitably happen.

In recent times there has been a fatality famously associated with the Tesla autopilot – although, as Tesla are obviously at pains to point out, it is technically a product in beta and they are clear that you should always concentrate on the road and be ready to take over manually; so in reality this accident might, at best, be attributed to a mix of the autopilot and the human.

However, there will always be some set of circumstances or seemingly unlikely event that neither human nor computer would be able to handle without someone getting injured or killed. Computers can’t beat physics, and if another car is heading up your one-way road at 100 mph – a road which happens to have a brick wall on one side and a high cliff on the other – then some sort of bad incident is going to happen. The new question we have to ask ourselves in the era of automation is: exactly what incident should that be?

This obviously isn’t actually a new question. In the uncountable number of human-driven road incidents requiring some degree of driver intervention to avoid danger that happen each day, a human is deciding what to do. We just don’t codify it so formally. We don’t sit around planning it out in advance.

In the contrived scenario I described above, where you’re between a wall and a cliff with an oncoming car you can’t get around, perhaps you instinctively know what you’d do. Or perhaps you don’t – but if you are unfortunate enough to have it happen to you, you’ll likely do something. This may or may not be the same action as you’d rationally pick beforehand, given the scenario. We rely on a mixture of human instinct, driver training and reflexes to handle these situations, implicitly accepting that the price of over a million deaths a year is worth paying to be able to undergo road travel.

So imagine you’re the programmer of the automated car. Perhaps you believe you might eliminate just half of those deaths if you do your job correctly; which would of course be an awesome achievement. But the car still needs to know what to do if it finds itself between a rock and a hard place. How should it decide? In reality, this is obviously complicated far further by the fact that there are a near-infinite number of scenarios and no-one can explicitly program for each one (hence the need for data-sciencey techniques to learn from experience rather than simple “if X then Y” code). But, simplistically, what “morals” should your car be programmed with when it comes to potentially deadly accidents?

  • Should it always try and save the driver? (akin to a human driver’s instinct for self-preservation, if that’s what you believe we have.)
  • Or should it concentrate on saving any passengers in the same car as the driver?
  • How about the other car driver involved?
  • Or any nearby, unrelated, pedestrians?
  • Or the cute puppy innocently strolling along this wall-cliff precipice?
  • Does it make a difference if the car is explicitly taking an option (“steer left and ram into the car on the opposite side of the road”) vs passively continuing to do what it is doing (“do nothing, which will result in you hitting the pedestrian standing in front of the wall”)?
    • You might think this isn’t a rational factor, but anyone who has studied the famous “trolley problem” thought experiment will realise people can be quite squeamish about this. In fact, this whole debate is to some extent a real-world realisation of that very thought experiment.
  • Does it make a difference how many people are involved? Hitting a group of 4 pedestrians vs a car that has 1 occupant? Or vice versa?
  • What about interactions with probabilities? Often you can’t be 100% sure that an accident will result in a death. What if the choice is between a 90% chance of killing 1 person or a 45% chance of killing two people? (See the quick expected-harm sketch after this list.)
  • Does it make a difference what the people are doing? Perhaps the driver is ignoring the speed limit, or pedestrians are jaywalking somewhere they shouldn’t. Does that change anything?
  • Does it even perhaps make a difference as to who the people involved are? Are some people more important to save than others?
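Here is that quick expected-harm sketch for the probability example above. A pure expected-value rule is just one possible policy, used here for illustration; it is not a claim about how any real system decides.

```python
# Expected deaths for the two hypothetical options from the list above
option_a = 0.90 * 1  # 90% chance of killing one person
option_b = 0.45 * 2  # 45% chance of killing two people

print(option_a, option_b)  # both come out at 0.9 expected deaths

# A car minimising expected deaths alone can't separate these options;
# it would need a further principle, e.g. risk aversion (prefer the
# option with the smaller worst case, which here means option A).
```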

Well, the MIT Media Lab is now giving you the opportunity to feed into those sorts of decisions, via its Moral Machine website.

To quote:

From self-driving cars on public roads to self-piloting reusable rockets landing on self-sailing ships, machine intelligence is supporting or entirely taking over ever more complex human activities at an ever increasing pace. The greater autonomy given machine intelligence in these roles can result in situations where they have to make autonomous choices involving human life and limb. This calls for not just a clearer understanding of how humans make such choices, but also a clearer understanding of how humans perceive machine intelligence making such choices.

Effectively, they are crowd-sourcing life-and-death ethics. This is not to say that any car manufacturer will necessarily take the results into account, but at least they may learn what the responding humans (which we must note is far from a random sample of humanity) think they should do, and the level of certainty we feel about it.

Once you arrive, you’ll be presented with several scenarios and asked what you think the car should do in each one. There will always be some death involved (although not always human death!). It’ll also give you a textual description of who and what is happening. It’s then up to you to pick which of the two options given the car should take.

Here’s an example:

[Image: an example Moral Machine scenario]

You see there that a child is crossing the road, although the walk signal is on red, so they should really have waited. The car can choose to hit the child who will then die, or it can choose to ram itself into an inconvenient obstacle whereby the child will live, but the driver will die. What should it do?

You get the picture; click through a bunch on those and not only does MIT gather a sense of humanity’s moral data on these issues, but you get to compare yourself to other respondents on axes such as “saving more lives”, “upholding the law” and so on. You’ll also find out if you have implied gender, age or “social value” preferences in who you choose to kill with your decisions.

This comparison report isn’t going to be overly scientific on an individual level (you only have a few scenarios to choose from apart from anything else) but it may be thought-provoking.

After all, networked cars of the future may well be able to consult the internet and use facts they find there to aid decisions. A simple extension of Facebook’s ability to face-recognise you in your friends’ photos could theoretically lead to input variables in these decisions like “Hey, this guy only has 5 twitter friends, he’ll be less missed than this other one who has 5000!” or “Hey, this lady has a particularly high Klout score (remember those?) so we should definitely save her!”.

You might not think we’d be so callous as to allow the production of a score regarding “who should live?”. Well, firstly, we have to: having the car kill someone by not changing its direction or speed, when the option is there that it could do so, is still a life-and-death decision, even if it results in no new action.

Plus we already do use scores in domains that bear on mortality. Perhaps stretching the comparison to its limits, here’s one example (and please do not take it that I necessarily approve or disapprove of its use, that’s a story for another day – it’s just the first one that leaps to mind).

The National Institute for Health and Care Excellence (NICE) provides guidance to the UK National Health Service on how to improve healthcare. The NHS, nationalised as it is (for the moment…beware our Government’s slow massacre of it though), still exists within the framework of capitalism and is held to account on sticking to a budget. It has to buy medicines from private companies and it can only afford so many. This implies that not everyone can have every treatment on the market. So how does it decide what treatments should be offered to who?

Under this framework, we can’t simply go on “give whatever is most likely to save this person’s life”, because some of the best treatments may cost so much that giving them to 10 people, of whom 90% will probably be cured, might mean that another 100 people who could have been treated at an 80% success rate will die, because there is no money left for the cheaper treatment.

So how does it work? Well, to over-simplify, they have famously used a data-driven process involving a quality-adjusted life year (QALY) metric.

A measure of the state of health of a person or group in which the benefits, in terms of length of life, are adjusted to reflect the quality of life. One QALY is equal to 1 year of life in perfect health.

QALYs are calculated by estimating the years of life remaining for a patient following a particular treatment or intervention and weighting each year with a quality-of-life score (on a 0 to 1 scale). It is often measured in terms of the person’s ability to carry out the activities of daily life, and freedom from pain and mental disturbance.

At least until a few years ago, they had guidelines that an intervention that cost the NHS less than £20k per QALY gained was deemed cost effective. It’s vital to note that this “cost effectiveness” was not the only factor that fed into whether the treatment should be offered or not, but it was one such factor.
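As a rough worked example of how a threshold like that gets applied – the treatment, its cost and the quality weight below are all invented purely for illustration:

```python
# Hypothetical treatment: costs £30,000 and is expected to give the patient
# 3 extra years of life at a quality-of-life weight of 0.6 (1.0 = perfect health)
cost = 30_000
extra_years = 3
quality_weight = 0.6

qalys_gained = extra_years * quality_weight  # 1.8 QALYs
cost_per_qaly = cost / qalys_gained          # about £16,700 per QALY

verdict = "cost effective" if cost_per_qaly < 20_000 else "not cost effective"
print(f"£{cost_per_qaly:,.0f} per QALY gained -> {verdict} under a £20k threshold")
```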

This seemingly quite emotionless method of measurement sits ill with many people: how can you value life in money? Isn’t there a risk that it penalises older people? How do you evaluate “quality”? There are many potential debates, both philosophical and practical.

But if this measure isn’t to be used, then how should we decide how to divide up a limited number of resources when there’s not enough for everyone, and those who don’t get them may suffer, even die?

Likewise, if an automated car cannot keep everyone safe, just as a human-driven car has never been able to, then on which measure, involving which data, should we base the decision as to who to save?

But even if we can settle on a consensus answer to that, and technology magically improves to the point where implementing it reliably is child’s play, actually getting these vehicles onto the road en masse is not likely to be simple. Yes, time to blame humans again.

Studies have already looked at the sort of questions that the Moral Machine website poses you. “The Social Dilemma of Autonomous Vehicles” by Bonnefon et al is a paper, published in the journal Science, in which the researchers ran their own surveys as to what people thought these cars should be programmed to do in terms of the balance between specifically protecting the driver vs minimising the total number of casualties, which may include other drivers, pedestrians, and so on.

In general respondents fitted what the researchers termed a utilitarian mindset: minimise the number of casualties overall, no need to try and save the driver at all costs.

In Study 1 (n = 182), 76% of participants thought that it would be more moral for AVs to sacrifice one passenger, rather than kill ten pedestrians (with a 95% confidence interval of 69—82). These same participants were later asked to rate which was the most moral way to program AVs, on a scale from 0 (protect the passenger at all costs) to 100 (minimize the number of casualties). They overwhelmingly expressed a moral preference for utilitarian AVs programmed to minimize the number of casualties (median = 85, Fig. 2a).
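Those figures are easy to sanity-check. A simple normal-approximation confidence interval for 76% of 182 respondents lands very close to the 69–82 range quoted – I haven’t checked which exact method the authors used, so this is just an illustrative reconstruction:

```python
import math

n, p = 182, 0.76                 # sample size and observed proportion
se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
z = 1.96                         # ~95% coverage under a normal approximation

low, high = p - z * se, p + z * se
print(f"95% CI: {low:.0%} to {high:.0%}")  # roughly 70% to 82%
```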

(This is also reflected in the results of the Moral Machine website at the time of writing.)

Hooray for the driving public; selfless to the last, every life matters, etc. etc. Or does it?

Well, later on, the survey tackled questions around not only what these vehicles should do in emergencies, but how comfortable respondents would personally be if vehicles did behave that way, and, lastly, how likely they would be to buy one that exhibited that behaviour.

Of course, even in thought experiments, bad things seem worse if they’re likely to happen to you or those you love.

even though participants still agreed that utilitarian AVs were the most moral, they preferred the self-protective model for themselves.

Once more, it appears that people praise utilitarian, self-sacrificing AVs, and welcome them on the road, without actually wanting to buy one for themselves.

Humans, at least in that study, appear to have a fairly high consensus that minimising casualties is key in these decisions. But we also have a predictable tendency to be the sort of freeloaders that prefer for everybody else to follow a net-safety-promoting policy, as long as we don’t have to ourselves. This would seem to be a problem that it’s unlikely even the highest quality data or most advanced algorithm will solve for us at present.

The Tableau #MakeoverMonday doesn’t need to be complicated

For a while, a couple of  key members of the insatiably effervescent Tableau community, Andy Cotgreave and Andy Kriebel, have been running a “Makeover Monday” activity. Read more and get involved here – but a simplistic summary would be that they distribute a nicely processed dataset on a topic of the day that relates to someone else’s existing visualisation, and all the rest of us Tableau fans can have a go at making our own chart, dashboard or similar to share back with the community so we can inspire and learn from each other.

It’s a great idea, and generates a whole bunch of interesting entries each week. But Andy K noticed that each Monday’s dataset was getting way more downloads than the number of charts later uploaded, and opened a discussion as to why.

There are of course many possible reasons, but one that came through strongly was that, whilst they were interested in the principle, people didn’t think they had the time to produce something comparable to some of the masterpieces that frequent the submissions. That’s a sentiment I wholeheartedly agree with, and, in retrospect – albeit subconsciously – it’s why I never gave it a go myself.

Chris Love, someone who likely interacts with far more Tableau users than most of us do, makes the same point in his post on the benefits of Keeping It Simple Stupid. I believe it was written before the current MakeoverMonday discussions began in earnest, but was certainly very prescient in its applications to this question.

Despite this awesome community many new users I speak to are often put off sharing their work because of the high level of vizzes out there. They worry their work simply isn’t up to scratch because it doesn’t offer the same level of complexity.

 

To be clear, the original Makeover Monday guidelines did say that it was quite proper to just spend an hour fiddling around with it. But firstly, after a hard day battling against the dark forces of poor data quality and data-free decisions at work, it can be a struggle to keep on trucking for another hour, however fun it would be in other contexts.

And that’s if you can persuade your family that they should let you keep tapping away for another hour doing what, from the outside, looks kind of like you forgot to finish work. In fact a lot of the worship I have for the zens is how they fit what they do into their lives.

But, beyond that, an hour is not going to be enough to “compete” with the best of what you see other people doing in terms of presentation quality.

I like to think I’m quite adept with Tableau (hey, I have a qualification and everything :-)), but I doubt I could create and validate something like this beauty using an unfamiliar dataset on an unfamiliar topic in under an hour.

 

It’s beautiful; the authors of this and many other Monday Makeovers clearly have an immense amount of skill and vision. It is fascinating to see both the design ideas and technical implementation required to coerce Tableau into doing certain non-native things. I love seeing this stuff, and very much hope it continues.

But if one is not prepared to commit the sort of time needed to do that regularly to this activity, then one has to try and get over the psychological difficulty of sharing a piece of work which one perceives is likely to be thought of as “worse” than what’s already there. This is through no fault of the MakeoverMonday chiefs, who make it very clear that producing a NYT infographic each week is not the aim here – but I certainly see why it’s a deterrent from more of the data-downloaders uploading their work. And it’s great to see that topic being directly addressed.

After all, for those of us who use Tableau for the day-to-day joys of business, we probably don’t rush off and produce something like this wonderful piece every time some product owner comes along to ask us an “urgent” question.

Instead, we spend a few minutes making a line chart that gives them some insight into the answer to their question. We upload an interactive bar chart, with default Tableau colours and fonts, to let them explore a bit deeper, and so on. We sit in a meeting and dynamically provide an answer to enable live decision-making – an answer which, before we had tools like this, would have had to wait a couple of weeks for a csv report. Real value is generated, and people are sometimes even impressed, despite the fact that we didn’t include hand-drawn iconography, gradient-filled with the company colours.

Something like this perhaps:

Yes, it’s “simple”, it’s unlikely to go Tableau-viral, but it makes a key story held within that data very clear to see. And it’s far more typical of the day-to-day Tableau use I see in the workplace.

For the average business question, we probably do not spend a few hours researching and designing a beautiful colour scheme in order to perform the underlying maths needed to make a dashboard combining a hexmap, a Sankey chart and a network graph in a tool that is not primarily designed to do any of those things directly.

No-one doubts that you can cajole Tableau into such artistry, that there is sometimes real value obtainable by doing so, or that those who carry it out may be creative geniuses – but unless they have a day job that is very different from that of me and my colleagues, then I suspect it’s not their day-to-day either. It’s probably more an expression of their talent and passion for the Tableau product.

Pragmatically, if I need to make, for instance, a quick network chart for “business”, then, all other things being equal, I’m afraid I’m more likely to get out a tool that’s designed to do that rather than take a bit more time to work out how to implement it in Tableau, no matter how much I love it (by the way, Gephi is my tool of choice for that – it is nowhere near as user friendly as Tableau, but it is specifically designed for that sort of graph visualisation; also recent versions of Alteryx can do the basics). Honestly, it’s rare for me that these more unusual charts need to be part of a standard dashboard; our organisation is simply not at a level of viz-maturity where these diagrams are the most useful for most people in the intended audience, if indeed they are for many organisations.

And if you’re a professional whose job is creating awesome newspaper-style infographics, then I suspect that you’re not using Tableau as the tool that provides the final output either, more often than not. That’s not its key strength in my view; that’s not how they sell it – although they are justly proud of the design-thought that does go into the software in general. But if paper-WSJ is your target audience, you might be better off using a more custom design-focused tool, like Adobe Illustrator (and Coursera will teach you that specific use-case, if you’re interested).

I hope nothing here will cause offence. I do understand the excitement and admire anyone’s efforts to push the boundaries of the tool – I have done so myself, spending way more time than is strictly speaking necessary in terms of a theoretical metric of “insights generated per hour” to make something that looks cool, whether in or out of work. For a certain kind of person it’s fun, it is a nice challenge, it’s a change from a blue line on top of an orange line, and sometimes it might even produce a revelation that really does change the world in some way.

This work surely needs to be done; adherents to (a bastardised version of) Thomas Kuhn’s theory of scientific revolutions might even claim this “pushing to the limits” as one of the ways of engendering the mini-crisis necessary to drive forward real progress in the field. I’m sure some of the valuable Tableau “ideas“, that feed the development of the software in part, have come from people pushing the envelope, finding value, and realising there should be an easier way to generate it.

There’s also the issue of engagement: depending on your aim, optimising your work for being shared worldwide may be more important to you than optimising it for efficiency, or even clarity and accuracy. This may sound like heresy, and it may even touch on ethical issues, but I suspect a survey of the most well-known visualisations outside of the data community would reveal a discontinuity with the ideals of Stephen Few et al!

But it may also be intimidating to the weary data voyager when deciding whether to participate in these sort of Tableau community activities if it seems like everyone else produces Da Vinci masterpieces on demand.

Now, I can’t prove this with data right now, sorry, but I just think it cannot be the case. You may see a lot of fancy and amazing things on the internet – but that’s the nature of how stuff gets shared around; it’s a key component of virality. If you create a default line chart, it may actually be the best answer to a given question, but outside a small community who is actively interested in the subject domain at hand, it’s not necessarily going to get much notice. I mean, you could probably find someone who made a Very Good Decision based even on those ghastly Excel 2003 default charts with the horrendous grey background if you try hard enough.


Never forget…

 

So, anyway, time to put my money where my mouth is and actually participate in MakeoverMonday. I don’t need to spend even an hour making something if I don’t want to, right?  (after all, I’ve used up all my time writing the above!)

Tableau is sold with emphasis on its speed of data sense-making, claiming to enable producing something reasonably intelligible 10-100x faster than other tools. If we buy into that hype, then spending 10 minutes of Tableau time (necessitating making 1 less cup of tea perhaps) should enable me to produce something that could have taken up to 17 hours to produce in Excel.

OK, that might be pushing the marketing rather too literally, but the point is hopefully clear. For #MakeoverMonday, some people may concentrate on how far they can push Tableau outside of its comfort zone, others may focus on how they can integrate the latest best practice in visual design, whereas here I will concentrate on whether I can make anything intelligible in the time that it takes to wait for a coffee in Starbucks (on a bad day) – the “10 minute” viz.

So here’s my first “baked in just 10 minutes” viz on the latest MakeoverMonday topic – the growth of the population of Bermuda. Nothing fancy, but hey, it’s a readable chart that tells you something about the population change in Bermuda over time. Click through for the slightly interactive version – although of course it still has the nasty default tooltips, thanks to the 10 minutes running out just as I was changing the font for the chart titles…

Bermuda population growth.png

 

 

#VisualizeNoMalaria: Let’s all help build an anti-Malaria dataset

As well as just being plain old fun, data can also be an enabler for “good” in the world. Several organisations are clearly aware of this; both Tableau and Alteryx now have wings specifically for doing good. There are whole organisations set up to promote beneficial uses of data, such as DataKind, and a bunch of people write reports on the topic – for example Nesta’s report “Data for good“.

And it’s not hard to get involved. Here’s a simple task you can do in a few minutes (or a few weeks if you have the time) from the comfort of your home, thanks to a collaboration between Tableau, PATH and the Zambian government: Help them map Zambian buildings.

Why so? For the cause of eliminating the scourge of malaria from Zambia. In order to effectively target resources at malaria hotspots (and in future to predict where the disease might flare up), they’re

developing maps that improve our understanding of the existing topology—both the natural and man-made structures that are hospitable to malaria. The team can use this information to respond quickly with medicine to follow up and treat individual malaria cases. The team can also deploy resources such as indoor spraying and bed nets to effectively protect families living in the immediate vicinity.

Zambia isn’t like Manhattan. There’s no nice straightforward grid of streets that even a crazy tourist could understand with minimal training. There’s no 3d-Google-Earth-building level type resource available. The task at hand is therefore establishing, from satellite photos, a detailed map of where buildings and hence people are. One day no doubt an AI will be employed for this job, but right now it remains one for us humans.

Full instructions are in the Tableau blog post, but honestly, it’s pretty easy:

  • If you don’t already have an OpenStreetMap user account, make a free one here.
  • Go to http://tasks.hotosm.org/project/1985 and log in with the OpenStreetMap account
  • Click a square of map, “edit in iD editor”, scan around the map looking for buildings and have fun drawing a box on top of them.

It may not be a particularly fascinating activity for you to do over the long term, but it’s more fun than a game of Threes – and you’ll be helping to build a dataset that may one day save a serious number of lives, amongst other potential uses.

Well done to all concerned for making it so easy! And if you’ve never poked around the fantastic collaborative project that is OpenStreetMap itself, there’s a bunch of interesting stuff available there for the geographically-inclined data analyst.

 

Is the EU referendum actually a great conspiracy?

(Sorry to anyone bored by the great/hideous Brexit referendum – this is the last post on the topic, well, at least until the event actually happens 🙂 )

Today is the day!  All us UK citizens can cast our direct-democracy vote as to whether the UK should remain in the EU, or say goodbye. It’s been a long, torrid, at times revolting, journey in terms of output from the campaigners, politicians and media. “It is as though the sewers have burst”, said Nick Cohen in the Observer, somewhat accurately. But the vote is today and it’ll therefore all be over soon.

Or will it? YouGov have surveyed on many, many EU referendumy topics. One of the latest included questioning respondents on various conspiracy-esque statements about the result of the referendum. I don’t use the word “conspiracy” in a necessarily derogatory tone – some perceived “conspiracies” turn out to be true, although many do not.

Anyway, here were the statements offered up to the public to pronounce on whether they thought they were probably true, probably false or don’t know.

  • There are plans for further EU integration and enlargement that the EU are deliberately not announcing till after the referendum
  • The BBC & ITN are not commissioning an exit poll in order to allow the vote to be fixed without anyone telling
  • MI5 is working with the UK government to try and stop Britain leaving the EU
  • It is likely that the EU referendum will be rigged

I have listed them in my perception of order of seriousness, although several are open to interpretation regarding the scope and intentionality they imply. The first just relates to the timing of announcing EU plans; the last implies the literal undermining of the entire democratic process – a pointless referendum beholden to corrupt, criminal actors.

But what did the respondents think of these? Did anyone seriously think that MI5 spies are secretly influencing the result? (*) That the whole referendum is a fraudulent scam?

(*) Well, it’s not quite MI5, but when the Conservative peer Baroness Warsi recently changed her view from Leave to Remain, there were people suggesting she had been a secret Remain campaign plant all along. Amongst other far more horrific diatribes that I am reluctant to reproduce on this site.

Well, it turns out the answer is yes, a fair amount of people do agree with these statements. Please click through and interact with the visualisation below in order to see the proportion of people agreeing with each statement, with the ability to break it down by age, gender, social grade, region, which political party they voted for in the 2015 general election, and – perhaps most interestingly – how they reported they intend to vote in the EU referendum itself: Leave vs Remain.

EU referendum conspiracy theory poll2

 

A few things I noticed:

There’s a sizeable amount of people that agree with every one of those statements. That’s not to say that they are the same single cohort of people in each case, as the data is too high level to determine that, but every statement has at least 15% of people in favour. There’s not one statement that over half the surveyed people thought was probably false. Not one.

To take perhaps the most dramatic one – nearly a third of the surveyed population think that it’s likely that the EU referendum will be rigged. If this implies “direct” rigging i.e. fiddling with the results, then this is quite a terrifying indictment on our view of the legitimacy of our democratic process.

Sidenote: There does seem to be a movement to “bring your own pen” to the voting stations today, under the premise that the pencils that poll booths traditionally offer leave marks on the ballot papers that could be easily erased and replaced. Although this seems like one of the most annoying and time-consuming ways I could imagine of fixing an election result! If you’re going to believe in an over-arching conspiracy here, then I suspect MI5 could have far more efficient methods…

When splitting by demographics and behaviour, clear differences emerge. Flicking through the interactive version will show you the full details not represented in the below text – but in summary, for most statements:

  • A fairly similar proportion of females and males believe they are true. But for those that don’t, females are more likely to say they don’t know whereas males are more likely to go for probably false.
  • Those of social grades ABC1 are generally less likely to think any of the statements are probably true than C2DE, and more likely to think they’re probably false.
  • There is a strong difference in the beliefs of the voters based on whether they’re likely to vote for Leave or Remain. Without exception, the Leavers are more likely to think the statements are probably true than the Remainers.  The proportion of Leavers who think the referendum is likely rigged is over four times the proportion of Remainers.

    EU referendum conspiracy theory poll

  • Digging down deeper into (the somewhat correlated, but not fully so) variable of which political party they supported in the 2015 election, there is one hugely obvious outlier. Those who voted UKIP are way more likely to agree with the statements than others, particularly regarding whether the EU referendum will be rigged. A majority, nearly two thirds, of UKIP voters believe this to be true, in comparison to between 14 and 23% of voters for other parties.

So, what does this mean?

Well, it shows a distinct lack of faith in the system set up for this referendum and trust in the “powers that be” – which is perhaps somewhat understandable, considering the ways the various campaigns have been run.

At first glance, the sheer level of disbelief in the overall integrity of the system seems a notably unhealthy sign of the times though – although I would like to see similar stats taken over previous years in order to determine whether the figure of 28% believing the referendum will be rigged is “normal” for every year. If so, it could certainly explain the non-amazing turnout the UK generally sees in elections.

…except that there’s a curious interaction regarding voting intention, political party and turnout. In a previous post here, we saw that UKIP supporters are one of the subsections of society that appear to be most likely to say that they will turn up and participate in the referendum. However, this is also the segment that is by far the most sceptical of the result being legitimate. UKIP was likely also one of the driving forces that led towards the referendum being called in the first place: if there was no visible block of desire to leave the EU, an issue that UKIP was originally set up to dedicate itself to, then there would have been no reason for a referendum.

That’s not to say other political parties don’t have members with anti-EU views in them, who are individually in places where they might be expected to have a higher influence in political shenanigans than the average UKIP candidate. Two of the highest profile Leave campaigners, Boris Johnson and Michael Gove, are both high-up members of the Conservative party.

But, simplistically, the people who most demanded the referendum and are most likely to go and vote in it also seem to be the people who are least likely to believe its results. Are we seeing a political form of Pascal’s wager?!

It would also suggest that no matter what the result is, the debate will be far from over. Particularly if the result goes to Remain, it seems like nearly half of those voting to leave may feel that it has been rigged (of course people are likely to forgive rigging more if it produces the answer they want). And even if it goes to Leave, one in ten Remainers are seemingly sceptical of its legitimacy already, which is a sizeable number of people who, even without the psychology surrounding losing a vote to those with different beliefs, believe that the entire system is invalid.

So recently we have learned:

Hey, it’s almost as if it isn’t really the time or place for such a consequential question about the future of the UK to be determined in this manner. Is it too late to call the whole thing off? (answer: yes, I guess it is).

Brexit: Which newspapers support Leave and which Remain?

Being a glutton for punishment, another Brexit question struck me. Which newspapers are formally standing in the Leave camp, and which in the Remain?

This question might strike you as beyond obvious based on the typical political outlook they adhere to and the output of their columnists – but it turns out it’s not as straightforward as I imagined.

Please feel free to click through and interact with the below dashboard. In the full version you can use a dropdown selector to colour code the marks based on who owns the paper, its general political outlook and which party it supported in the 2015 UK general election.

Where do newspapers officially stand on Brexit

A couple of things stood out to me:

  • Right now, the big arguments for Leave are coming to us tinged with (well, totally submerged in) arguments appealing to the right wing of the political spectrum. However, there are papers who typically hold right-wing views that are pro-Remain, albeit a minority. All the more left-wing papers that have declared are pro-Remain.
  • In fact even within papers owned by the same organisation / person, it can be that some back Leave and some Remain.

    The big shocker to me here was the Mail on Sunday backing Remain. One of the big scare campaigns from Leave boils down to “dreadful immigrants will come and eat your children if you don’t vote Leave”. The Mail on Sunday famously loves this sort of stuff – a 5-second Google found “Free hotels for the Calais stowaways in soft touch Britain” as a prime example of what they publish.

    Now, whether this is proprietors hedging their bets, or decisions made at an editor rather than proprietor level I do not know – but it’s not quite what I expected. You can see the same sort of division in the Murdoch papers too.