Tableau software is now free for eligible non-profit organisations and educators

Heart-warming news from Tableau, via PRNewsWire, for non-profits who want to save the world through data analysis:

Tableau Foundation, part of Tableau Software’s (NYSE: DATA) corporate social responsibility (CSR) program, announced today that it will offer free licenses of Tableau Desktop to nonprofit organizations

The qualification criteria are currently listed by Tableau as follows:

  • Be registered as a 501(c)(3) organization (global availability coming soon)
  • Operate with an annual budget of $5 million or less
  • Not be a school, college, mutual organization, healthcare organization, or government agency

If you’re an educational facility then you might already be able to get some free copies courtesy of Tableau’s longer-standing academic program (but I believe only for use in the classroom, and not for back-office school analytics – which is a shame, as UK schools currently seem in quite desperate need of both saving money and having decent data analysis tools behind the scenes as well as in the classroom!).

For those that are eligible and interested, the above is a very generous offer you shouldn’t turn down! For reference, Tableau Desktop Professional normally retails around $2k. The tool is easy enough to use, with a great set of online resources and large helpful community, such that you can get by without paying for training.

How to map geographies in Tableau that are not built in to the product (e.g. UK postcodes, sales areas)

Tableau has a nice ability to create point (“symbol”) or polygon (“filled”) maps to visualise spatial data on. However, in order to do this, it of course needs to understand where each point you wish to plot is in the real world.

Several of the most common geographic identifiers are built in, such as country, region and sometimes part of the postcode. A list, and some instructions on how to use them, can be seen near the bottom of this article in the section called “Built in geographic roles”. If your data fits this format it is extremely easy to use.

However, especially outside of the US, the geospatial data coverage is not amazing. For instance, Tableau cannot immediately understand Sweden’s postcodes at all. In the UK, it knows the first half of the postcode (“AB1”) but not the full postcode (“AB1 2CD”). It also sometimes knows the centroids of, for example, postcodes, which allows the construction of a point-based map, but not the polygons needed for a shaded map.

Of course it also does not know your own custom geographies, such as sales territories, drivetimes around stores and so on.

Although it’s usually not nearly as user-friendly or fast as using the built-in mapping geographies, there are several workarounds or alternatives if you need to map something that isn’t built in. Here are some options I’ve successfully used in the past. Tribute must be paid to the Information Lab, a consultancy whose blog contains some amazingly useful Tableau mapping information and insights covering a lot of these.

Some options described below:

Option 1: Geocode your source data

Tableau can always plot any latitude and longitude co-ordinate you give it, so if you can arrange for your datasource to include that on each record then it is very simple to select those and have them plotted as points. Here are some very quick instructions.

If your datasource already has latitude and longitude in it then you’re sorted for point-based mapping – just use them verbatim. If not, you will need to add the co-ordinates to your dataset somehow.

There are plenty of free websites to help you find the co-ordinates if you have the street address in text form (e.g. 1, 2 – these suggestions are not necessarily recommendations! Google will find you plenty more if those aren’t good), plus some “desktop GIS” packages can do the same.

The challenge comes if you have so many addresses to look up that these semi-manual efforts are not efficient. Desktop GIS tools or specialised analytics packages like Alteryx can often do this at mass scale, albeit sometimes requiring expensive data licenses.

If you don’t want the expense/bother/tedium of in-house solutions, there are also several companies to whom you can send a long file of addresses and get the co-ordinates back. Be warned that they will usually charge a fee and require uploading potentially sensitive personal information. However, if you need large volumes of geocoding, the cost may be more than worth it in terms of time saved.
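If you would rather keep bulk geocoding in-house and don’t mind a little scripting, R can do it too. Here is a minimal sketch using the tidygeocoder package against the free OpenStreetMap Nominatim service – the choice of package and the example addresses are mine for illustration, not part of any of the tools above, and Nominatim is rate-limited, so check its usage policy before sending large or sensitive batches.

library(tidygeocoder)

# Hypothetical input: a small data frame of addresses to geocode
addresses <- data.frame(
  id      = 1:3,
  address = c("10 Downing Street, London",
              "Buckingham Palace Road, London",
              "221B Baker Street, London")
)

# Look up co-ordinates via the free OpenStreetMap Nominatim service
geocoded <- geocode(addresses, address = address, method = "osm")

# The result gains lat/long columns you can join back to your source data
head(geocoded)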

Option 2: Join / blend / “Tableau custom geocode” geographic co-ordinates to your data

This is similar to option 1, but rather than geocoding every record in your database, you instead have a lookup table with the appropriate identifier (e.g. postcode) and a latitude/longitude co-ordinate. There are then various options to combine this with your source data and have Tableau plot your records on the map.

  • Tableau “custom geocoding” is probably the most flexible and reusable option after setting it up – instructions here. You are essentially creating a new geographic category within Tableau, so as well as knowing where the country of Spain is, it might also know where the airport of Heathrow is. Note though that this can only by default give you “point” mapping rather than “polygons” – so you could pop a dot in the middle of your sales area, but you can’t have it draw the boundaries like Tableau can do for countries. However, there is a workaround for this – see option 5.
  • Database-join your lat/long lookup file to your source file – e.g. a JOIN in a database, a VLOOKUP in an Excel source, etc. – such that every source record has an associated co-ordinate (see the R sketch after the example table below for one way to do this)
  • Tableau blend the lat/long lookup file with your data, if it is in a suitable format. (See here for some notes regarding deciding between joining and blending data.)

The limitation here is often granularity. If your lookup table has one co-ordinate per postcode, then all the records with a given postcode will appear in the same place, whereas geocoding a precise street address gives a much higher level of detail (although that is often unnecessary for analysis).

An example file format for this might be:

Postcode  Latitude  Longitude
12525     63.42124  51.124123
12526     63.41211  50.123212
12527     62.51622  50.98271
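If you go down the database-join route, the equivalent in R is a one-line merge. A minimal sketch, assuming hypothetical files orders.csv (your source data, containing a Postcode column) and postcode_lookup.csv (a lookup in the format shown above):

# Source records plus the postcode -> co-ordinate lookup
orders <- read.csv("orders.csv")
lookup <- read.csv("postcode_lookup.csv")

# Left join so every source record picks up a co-ordinate where one exists
orders_geo <- merge(orders, lookup, by = "Postcode", all.x = TRUE)

# Any record whose postcode is missing from the lookup gets NA co-ordinates
sum(is.na(orders_geo$Latitude))

write.csv(orders_geo, "orders_with_coords.csv", row.names = FALSE)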

If you don’t already have this sort of lookup data then there are various places to find it.

  • For official postal address information, often the country’s government or main postal service is the supplier of this data. This often has a cost associated with it (although it might not be much) but is usually the most accurate and timely.
  • Various agencies/consultancies can supply this type of information, usually at a higher cost.
  • Various websites / open data initiatives publish it for free, which Google searches will quickly find. One of the first that comes up today is Geonames, for instance (this is not necessarily a recommendation). But as ever with free public information, if it is not from an authoritative source there is some likelihood that it is not entirely accurate or complete, so it would be good practice to test it first. Be sure especially to check the granularity and recency of the geocoding, as a lot of this free data may lack a formal code-book.

Option 3: 3rd party mapping files designed for Tableau

Tableaumapping.bi is a great site for free downloads of mapping data, both points and polygons, designed to work with Tableau. These are provided free of charge by various kind people, I believe (organised by the super folk at the Information Lab), so accuracy and updates are probably not guaranteed. However, whenever I have had cause to use them I have enjoyed the experience.

As they are designed specifically to be used in Tableau it should be a relatively easy job to get them working. Below is a video they have showing how to use them.

There are also commercial companies that create these types of things for a fee – Mapbox is one mapping company that works with Tableau, for instance.

Option 4: convert a “shape file”

Most of the above options will result in a “point” datasource rather than a polygon one. One of the most common polygon mapping data formats is the shape file (“.shp”). Again, you can often find these files freely downloadable from the web (the above warnings about accuracy and how up to date they are may apply – here is one site that offers a collection of links, which I have not tested), or purchasable and supported from professional GIS data shops/consultancies at a cost.

Shape files are normally used in GIS systems, and general analytics products like Tableau or Excel often can’t read them. However, it is possible to convert a shape file into a file Tableau can use. Google searches will, as usual, give you several methods of doing so, but perhaps the easiest and cheapest way I have found so far, if your shape file is not huge, is to use the converter available in the free Alteryx Analytics Gallery.

You upload your shape file. It will then convert it and give you a file to download that works in Tableau. You can then use the same technique as shown in option 3 above to use it in Tableau. You do have to register with the gallery but it’s free of charge. Note any restrictions you have about uploading personal data to 3rd parties if appropriate.

The main limitation of that method is that a large or complicated shapefile may not work over the web. The Alteryx desktop software itself can process almost any size of file relatively quickly, and you can get a 14-day demo here. After the demo expires, though, you cannot use that feature any more without buying the full software package.

Large shape files may also result in large Tableau polygon data which can make your visualisations very slow with huge file sizes. If this is the case then, if you can sacrifice accuracy, opting to “generalise” the polygons in the conversion process can make a big difference. This reduces the number of polygon points, meaning it is less precise, but much faster, which is a good trade-off for many spatial tasks.
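If you’d rather not use a web converter at all, the same conversion and generalisation can be scripted in R with the sf package. This is an alternative I’m sketching out, not the Alteryx Gallery method described above, and the file names and tolerance value are illustrative. It reads a shapefile, optionally simplifies it, and writes one row per polygon vertex with a polygon ID and point order – the general shape of data Tableau’s polygon maps expect (you may need to adjust the ID columns to suit your data).

library(sf)

# Read the (hypothetical) shapefile
shapes <- st_read("boundaries.shp")

# Optionally "generalise" the polygons to cut the number of points;
# dTolerance units depend on the CRS and sf version, so experiment
shapes <- st_simplify(shapes, dTolerance = 100)

# Convert to WGS84 latitude/longitude for Tableau
shapes <- st_transform(shapes, 4326)

# One row per vertex, with a polygon ID and point order
polys  <- st_cast(shapes, "POLYGON")
coords <- st_coordinates(polys)
out <- data.frame(
  Longitude  = coords[, "X"],
  Latitude   = coords[, "Y"],
  PolygonID  = coords[, "L2"],
  PointOrder = ave(coords[, "X"], coords[, "L2"], FUN = seq_along)
)

write.csv(out, "tableau_polygons.csv", row.names = FALSE)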

Option 5: modify Tableau’s geocoding database to put your own custom polygons in as an option

This is really a special case of option 2’s first bullet point, but allows you to create easily reusable custom mapping polygons that behave in the same way as the default Tableau country, region etc. type datasets do. It’s another cunning mapping trick discovered by the geniuses at the Information Lab which uses Alteryx to rewrite the database Tableau uses for its custom geocoding.

As per option 4, you need a shapefile and Alteryx to get this one working; full instructions are provided by Craig Bloodworth here. However, if you don’t have either of those pre-requisites, or don’t fancy taking the time to follow the instructions, then they kindly provide several useful basics like UK postcode areas and continents here and here.

Are station toilets profitable?

After being charged 50p for the convenience of using a station convenience, I became curious as to whether the owners were making much money on this most annoying expression of a capitalistic monopoly high on the needs of many humans.

It turns out data on those managed by Network Rail is available in the name of transparency – so please click through and enjoy interacting with a quick viz on the subject.

Train station toilet viz

Microsoft Academic Graph: paper, journals, authors and more

The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals and conference “venues” and fields of study.

Microsoft have been good enough to structure and release a bunch of web-crawled data around scientific papers, journals, authors, URLs, keywords, references between them and so on, for free, here. Perfect for understanding all sorts of network relationships between these nodes of academia.

The current version is 30 GB of downloadable text files. It includes data on the following entities:

  • Affiliations
  • Authors
  • ConferenceSeries
  • ConferenceInstances
  • FieldsOfStudy
  • Journals
  • Papers
  • PaperAuthorAffiliations
  • PaperKeywords
  • PaperReferences
  • PaperUrls

Because the data is web-scraped and has only been minimally processed, Microsoft do instruct users to beware that the quality is not perfect – but it’s apparently the biggest chunk of bibliographic data of this kind that has been released for the public to do with what it will.
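If you want a quick feel for what’s in one of the files before committing to loading 30 GB, something like the following R sketch works. Note that I am assuming the files are plain tab-delimited text with no header row – check the schema documentation that accompanies the download for the real column names and order.

# Peek at the first few thousand rows of one entity file (assumed tab-delimited,
# no header -- confirm against the schema documentation before relying on this)
journals <- read.delim("Journals.txt", header = FALSE, sep = "\t",
                       quote = "", nrows = 5000, stringsAsFactors = FALSE)
str(journals)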

Kruskal Wallis significance testing with Tableau and R

Whilst Tableau has an increasing number of advanced statistical functions – a case in point being the newish analytics pane from Tableau version 9 – it is not usually the easiest tool to use to calculate any semi-sophisticated function that hasn’t yet been included.

Various clever people have tried to work some magic around this, for instance by attempting “native” Z-testing. But in general it’s sometimes a lot of workaround-y effort for little gain. However, a masterstroke of last year’s Tableau release was that it included the ability to interface with R.

R is a statistical programming language that really does have every statistical or modelling function you’ll ever need. When someone invents new ones in the future, I have no doubt an R implementation will be forthcoming. Its flexibility and comprehensiveness (and price… £0) are incredible. However, it is not easy to use for the average person. For a start, the default interface is a command line, and – believe it or not – there are people alive today who never had the singular joy of the command line being the only way to use a computer.

It must be said that in some ways Tableau has not really made R super easy to use. You still need to know the R language in order to use R functionality inside Tableau. But using the Tableau interface into R has two huge benefits.

  1. It integrates into your existing Tableau workspace. The Tableau interface will push any data you have in Tableau, with all your filters, calculations, parameters and so on, to R and wait for a result. There’s no need to export several files with different filters etc. and read.csv(“myfile1.csv”) them all into R. Once you’ve set up the R script, it reruns automatically whenever your data changes in Tableau.
  2. It visualises the results of the R function immediately and dynamically, just like any other Tableau viz. R has a lot of cool advanced graphics stuff, but it can be hard to remember quite how to use it, and most of the basic components offer no interactivity. No drag-and-drop pills or “Show Me” to be seen there!

So recently when I needed to perform a Kruskal Wallis significance test over data I was already analysing in Tableau, this seemed the obvious way to do it.

Kruskal-Wallis is fully explained on Wikipedia. To over-simplify for now, it allows you to detect whether there is a difference in data that comes from multiple groups – like ANOVA more famously does, but KW doesn’t require the data to be parametric.

This is classic hypothesis testing. With my questionnaire, we start with the hypothesis that there is no difference in the way that different groups answer the same question, and see if we can disprove that.

In honesty, the fact it’s a KW test is not all that important to this post. Using R to perform a certain test in Tableau is pretty much the same as it would be for any other test, aside from needing to know the name and syntax of the test in R. And Google will never let you down on that front.

In this case, I wanted to analyse questionnaire results scored on a Likert-type scale to determine, on several dimensions, whether the answering patterns were significantly different or not. For instance, is there a real difference in how question 1 is answered based on the respondent’s age? Is there a real difference in how question 2 is answered based on the respondent’s country of origin? How confident can we be that any difference in averages represents a real difference and is not the result of random noise?

I wanted to do this for about 30 questions over 5 different dimensions – a total of 150 Kruskal-Wallis tests – and I didn’t overly fancy doing it manually.

Had I only had a copy of pure R to hand, this is how I might have tackled the question – assuming that I had first output the results of question 1, dimensioned by group, to a CSV file in a format like this:

RespondentID  Group    Score
R1            Group A  4
R2            Group B  5
R3            Group B  5
R4            Group A  3
R5            Group C  1


# Move to the folder containing the exported questionnaire data
setwd('../desktop/questionnaire')
# Read the exported responses for question 1, then keep just the
# score and group columns, renamed for clarity
q1_data <- read.csv('q1_responses.csv')
q1_dataframe <- setNames(data.frame(q1_data[3], q1_data[2]), c('scores','groups'))
# Run the Kruskal-Wallis test: do scores differ between the groups?
kruskal.test(scores ~ groups, data=q1_dataframe)

This would give me output like the below.


Kruskal-Wallis rank sum test
data: scores by groups
Kruskal-Wallis chi-squared = 2.8674, df = 2, p-value = 0.2384

To understand the syntax and what it returns, see the man page for kruskal.test here.

But to pick out a single point, it shows that the “p-value” – the probability that differences in scores this large between the groups would appear purely through the random fluctuations of chance – is 0.2384, i.e. about 24%. Values above 5–10% are often held to mean that any difference seen has not been statistically proven. Here, there’s about a 1-in-4 chance that any differences in these answers are purely due to random chance.

There are endless discussions/controversies as to the use of P values, and statistical vs “clinical” differences – but this is the end point we are aiming for right now.

I could then have done the above process for the other 149 combinations I wanted to test and constructed a table of p values (or scripted the same – see the sketch below), but it seemed more fun to use Tableau.
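For completeness, the scripted version isn’t much longer than the single test. A minimal sketch, assuming a hypothetical long-format export all_responses.csv with one row per answer and columns Question, Group and Score:

responses <- read.csv("all_responses.csv")

# One Kruskal-Wallis p value per question: split the data by question,
# run the test on each piece and keep just the p value
p_by_question <- sapply(
  split(responses, responses$Question),
  function(q) kruskal.test(Score ~ Group, data = q)$p.value
)

round(p_by_question, 4)

Repeating that per dimension would build the full table of results, but on with the Tableau version.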

First you have to set up Tableau to interface with R correctly. This means running an R server (called Rserve, handily enough), which is of course also free, and which can be started from within the R interface if you have it open. Here are Tableau’s handy instructions, but it boils down to:
install.packages("Rserve")
library(Rserve)
Rserve()

A few notes:

  • It’s fine to run the Rserve on the same computer as Tableau is on.
  • You only have to enter the first line the first time you use Rserve.
  • Capitalisation of Rserve matters for some commands!
  • If you are going to use Rserve a lot more than you use R you can start it from an exe file instead, as detailed in the above Tableau article.
  • Workbooks using R will not work in Tableau Reader or Tableau Online.

Then carry on the process of setting up R as per the Tableau article. Notably you have to look in the Help menu (a strange place for it perhaps) for “Settings and performance” then “Manage R connection” and then type in the name of your Rserve host. If you’re running it on your computer then choose “localhost” with no user/password.

R connection

Next, assuming you have your Tableau workbook already set up with your questionnaire data, create a new worksheet which filters to show only a single question, and has each unique person that answered that question and the grouping variable on the rows. This layout is important, as will be discussed later.

Layout

Now it’s time to write the formula that interfaces between Tableau and R.

This Tableau blog post explains in detail how to do that in general – but the essence is to use a calculated field with whichever of the Tableau SCRIPT_ functions corresponds to the type of result (string, integer etc.) you want to retrieve from R. You pass it literal R code, using the placeholder variables .arg1, .arg2, … .argN to tell it which data you want Tableau to send to R.

The calculated field will then send the R script – with the relevant data inserted into the placeholders as R vectors – to the R server, and wait for a result to be returned. The result can then be used inside Tableau just as any other table calculation can.

For the Kruskal-Wallis test, we saw above that the p value we want to retrieve is a decimal number. This is what Tableau calls a “real” number, so we use the SCRIPT_REAL function.

Within that function we enter R code resembling the last line in my R-only attempt, which was the one that actually did the calculations.

You can put more than one line of R code in here if you want to. However, note that a Tableau R function can only return a single value or vector by default. Here I have therefore specifically asked it to return just the p value, as it would be hard to handle the full table of results and associated text. Read the R documentation for any function to understand which other values it can return.
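To see what kruskal.test() actually gives you back – and hence which single values you could choose to return to Tableau – you can inspect the result object in plain R first (continuing from the q1_dataframe created earlier):

# kruskal.test() returns an "htest" object: a list of named elements,
# any single one of which could be returned to Tableau on its own
result <- kruskal.test(scores ~ groups, data = q1_dataframe)
names(result)    # "statistic" "parameter" "p.value" "method" "data.name"
result$p.value   # the single number we want: 0.2384 in this example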

So, in Tableau, here’s the calculated field to create.

Kruskal Wallis p score:


SCRIPT_REAL(
"
kruskal.test(.arg1~.arg2, data=data.frame(.arg1,.arg2 ))$p.value
",
AVG([Score]),ATTR([Group]))

If you compare it to my first R script, you can see that I want it to replace “.arg1” with the first argument I gave after the R script (AVG([Score])) and “.arg2” with the second (ATTR([Group])). It’s safe, and necessary, to use ATTR() as you have to pass an aggregation. Because we set the layout to one row per respondent, we know each row will have only a single group associated with that respondent (each respondent must belong to exactly one group for a successful KW test).

You could use .arg3, .arg4 and so on if you needed to pass more data to R, but that’s not necessary for the Kruskal-Wallis test.

You can then double-click your new calculated field and it’ll appear on the Text shelf, displaying the p value we saw earlier (remember that 0.2384 number? Hooray, it matches!) repeatedly, once per respondent. The nature of the KW test is that it returns one value that describes the whole population.

KW test
This being Tableau, we can filter to show another single question (see below for why it must be a single one), change the respondent pool, highlight, exclude and do all sorts of other things to the data, and the new p value for the given breakdown is immediately shown (subject to the rules of table calculations… remember that SCRIPT_ functions are table calculations!).

But what if we want the Kruskal P value for all the questions in the survey in one table, and not have to iterate manually through every question?

We can’t just unfilter or add questions willy-nilly, because a single p value will be calculated for the whole dataset of (question 1 + question 2) responses, instead of one being calculated for (question 1) and another for (question 2). This is unlikely to be useful.

KW whole population

See how there’s no 0.2384 value in the above. Instead, the same number is repeated for every question and respondent. It compares the scores as though they were all answers to the same question.

By virtue of the SCRIPT_ Tableau functions being table calculations, though, this one is relatively easy to solve.

Right clicking on the R calculation field gives the normal Edit Table Calculation option. Pick that, and then select “Advanced” under Compute Using.

Simplistically, we want to partition the p-value calculation so that one value is returned per survey question, so put “Question” in the partition box and the other fields into the addressing box.

Advanced compute by

Hit OK, Tableau refreshes its R stuff, and all is good. That 0.2384 is back (and the 0 for question two is the result of rounding a number so small that the p value shows significance at all practical levels.)
1 row per response

However, we’re getting 1 row for every respondent to every question instead of a nice summary of one row per question.

We can’t just remove the respondent ID from the Tableau view because, whilst this gives the layout we want, the SCRIPT_REAL calculation will pass only the aggregated values of the respondent data to R, i.e. one average value per question in this case, not 1 per person answering the question.

This will give either an error or a wrong answer, depending on which R function you’re trying to call. In this case it would be passing the average response to the question to R to perform the KW test on, whereas R needs each individual response to the question to provide its answers.

The solution to this issue is the same as the classic trick they teach in Tableau Jedi school that allows data to be dynamically hidden yet remain within the scope of a table calculation: when you use a table calculation as a filter, the calculations are mostly done before the filter is applied, in contrast to a normal filter, where calculations are done on the post-filtered data.

So, create a simple dummy table calculation that returns something unique for the first row. For instance, create a calculated field just called “Index” that has the following innocuous function in it.


INDEX()

Now if you put that on filters and set it to only show result “1” (the first result) you will indeed get 1 result shown – and the 0.2384 P value remains correct.

1 row

But we actually wanted the first result per question, right, not just the single first result?

Luckily our index function is also a table calculation, so you can do exactly the same compute-by operation we did above to our R-based field.

Right click the index filter pill, go back to “Edit table calculation”, and fill the Compute Using -> Advanced box in in the same way as we previously did to our R calculation function, putting “Question” into the partition box and the rest into addressing.

Tada!

1 row per question

In the interests of neatness we can hide the respondent ID and group columns, as they don’t tell us anything useful here. Right-click the respective pill on Rows and untick “Show header” to turn the display off, whilst leaving it on the shelf and hence in the level of detail.

Clean table

Bonus tips:

  • You can send the values of Tableau parameters through the R function, controlling either the function itself or the data it sends to R – so for instance allowing a novice user to change the questions scored or the groups they want to compare without having to mess with any calculated fields or filters.
  • The R-enabled calculation can be treated as any other table calculation is in Tableau – so you can use it in other fields, modify it and so on. In this example, you could create a field that highlights questions where the test showed the differences to be significant at 95% certainty, like this.

Highlight 95% significance:


IF [Kruskal Wallis test p value] <= 0.05 THEN "Significant" ELSE "Not significant" END

Pop it on the colours card and you get:

Highlight significance

Now it’s easy to see that, according to Kruskal-Wallis, there was a definite difference in the way that people answered question 2 depending on which group they were in, whereas at this stringent level, there was no proven difference in how they answered question 1.

You can download the complete workbook here.

A first look at Alteryx 10’s Network Analysis tool

Alteryx version 10 was recently released, with all sorts of juicy new features in realms such as usability, data manipulation and statistical modelling. Perhaps the most interesting one for me, though, is the new Network Analysis tool.

This provides an easy way to make network graph visualisations natively – something many general-purpose analytical tools don’t do (or require workarounds for). Behind the scenes it uses R but, as per the other Alteryx R tools, you don’t need to worry about that.

Until now, I had used Gephi for such work; it’s a great free open-source program which is tremendously capable at this style of analysis, but it is not always particularly friendly or easy to use, and it requires data to be exported into it.

In a previous post I wrote about the basics of getting data into Gephi and visualising it. The very simple example I gave there is easily replicable in Alteryx. Here’s how:

First create your tables of nodes (the dots) and edges (the lines between the dots).

The documentation states that your nodes must have a unique identifier with the fieldname “_name_” and the edges must have fields “from” and “to”. In practice I actually found it often works fine even without those specific field names, but it is easy to rename columns in Alteryx (use the Select tool, for instance), so one might as well follow the instructions where possible.

So for a basic example, here’s our table of nodes:

_name_  label  Category
1       A      Cat1
2       B      Cat1
3       C      Cat1
4       D      Cat2
5       E      Cat2
6       F      Cat2
7       G      Cat3
8       H      Cat3
9       I      Cat3
10      J      Cat3

And edges:

From  To
1     2
1     3
1     4
1     7
1     9
2     8
2     7
2     1
2     10
3     6
3     8

Pop a “Network Analysis” tool onto the canvas. It’s in the Predictive section of the Alteryx toolbar. Then hook up your nodes file to the N input and edges file to the E input.

Alteryx network viz workflow

There are some configuration options on the Network Analysis tool that I’ll mention briefly shortly, but for now, that’s it – job done! Press the run button and enjoy the results.

The D output of the tool gives you a data table, one row per node, showing various graph-related statistics for each node: betweenness, degree, closeness, pagerank and evcent. You can then use these statistics directly later on in your workflow.
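As a rough illustration of what is happening under the hood (Alteryx doesn’t document exactly which R routines it calls, so treat this as my approximation rather than the tool’s actual code), R’s igraph package can produce comparable per-node statistics from the example node and edge tables above:

library(igraph)

# The example node and edge tables from above, as data frames
nodes <- data.frame(name  = 1:10,
                    label = LETTERS[1:10],
                    Category = rep(c("Cat1", "Cat2", "Cat3"), times = c(3, 3, 4)))
edges <- data.frame(from = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3),
                    to   = c(2, 3, 4, 7, 9, 8, 7, 1, 10, 6, 8))

g <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)

# Per-node statistics comparable to those in the D output
stats <- data.frame(
  node        = V(g)$name,
  degree      = degree(g),
  betweenness = betweenness(g),
  closeness   = closeness(g),
  pagerank    = page_rank(g)$vector,
  evcent      = eigen_centrality(g)$vector
)
stats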

The I output gives you an interactive graphical representation of your network, with cool features like the ability to search for a given node, tooltips on hover, click-to-drag/highlight nodes, some summary stats and histograms of the various graph statistics that describe the characteristics of your network, like this:

Capture

Although for most tools the “auto-browse” function of Alteryx 10 negates the need for a Browse tool, you will need one connected to the I output if you want to see the graphic representation of your network.

There are some useful configuration options in the Network Analysis tool itself, in three categories: nodes, edges and layout.

Perhaps the 3 most interesting ones are:

  • the ability to size nodes either by their network statistics or by another variable
  • the ability to have directed (A connects to B, but B might not connect to A) or undirected (A connects to B implies B connects to A) edges
  • the ability to group nodes by either network statistics or another variable (e.g. to differentiate between Facebook friends and Facebook groups)

Here for example is the above diagram where the nodes are sized by degree (# connections), coloured by my variable “Category” and the edges are set to directed.

Options for network viz tool

Network viz with options


Sidenote 1: There seems to be a trick to getting the group-by-variable to work though, which I’m not sure is intentional(?). I found that the tool would only recognise my grouping variables if they were specifically of type “String”.

Text from an Alteryx input file usually defaults to type “V_String”, but the Network Viz tool would not let me select my “Category” field to group nodes by if I left it at that. However, it’s very easy to convert from V_String to String by use of a Select tool.

Select tool to string

Sidenote 2: For people like me who are locked down to an old version of Internet Explorer (!) – the happy news is that the Alteryx network viz works even in that situation. In previous versions of Alteryx I found that the “interactive” visualisations tended to fail if one had an old version of IE installed.


Overall, the tool seems to work well, and is as quick and easy to use as users of Alteryx have probably come to expect. It even, dare I say it, has an element of fun to it.

It’s not going to rival – and probably never will try to – the flexibility of Gephi for those hand-crafting large complex networks with a need for in-depth customisation options and output. Stick with that if you need the more advanced features (or if you can’t afford to buy Alteryx!).

But for many people, I believe it contains enough features, even in this first version, to do the basics of what most analysts probably want a network viz for, and it will save you hours compared with finding and learning another package.

At least for relatively small numbers of nodes anyway; on my first try I found it hard to gain much insight from the display of a larger network as the viewing area was quite small – but some of this is innate to the nature of the visualisation type itself. I have also not yet experimented very much with the different layout options available, some of which might dramatically improve things if they have similar impact to the Gephi layout options. Picking the optimum location to display each node is a distinctly non-trivial task for software to do!

Remember also that as the “D” output gives a data table of network stats per node, one could always use that output to pre-filter another incarnation of the network viz tool and show only the most “interesting” nodes if that was more useful.

In general this new Alteryx tool is so easy to use and gives such quick results that I hope to see it promote effective use of such diagrams in real-world situations where they can be useful. At the very least, I’m sure it’ll convince a few new “data artisans” to give network analysis a try.