Books I read in 2017

Long-term readers (hi!) may recall my failure to achieve the target I had of reading 50 books in 2016. I had joined the 2016 Goodreads reading challenge, logged my reading activity, and hence had access to the data needed to track my progress at the end of the year. It turns out that 41 books is less than 50.

Being a glutton for punishment, I signed up again in 2017, with the same cognitively terrifying 50 book target – basically one a week, although I cannot allow myself to think that way. It is now 2018, so time to review how I did.

Goodreads allows you to log which books you are reading and when you finished them. The finish date is what counts for the challenge. Nefarious readers may spot a few potential exploits here, especially if competing for only 1 year. However, I tried to play the game in good faith (but did I actually do so?  Perhaps the data will reveal!).

As you go through the year, Goodreads will update you on how you are doing with your challenge. Or for us nerd types, you can download a much more detailed and useful CSV. There’s also the Goodreads API to explore, if that floats your boat.

Similarly to last year, I went with the CSV. I did have to hand-edit the CSV a little, both to fill in a little data that appears to be absent from the Goodreads dataset, and also to add a couple of extra data fields that I wanted to track that Goodreads doesn’t natively support. I then popped the CSV into a Tableau dashboard, which you can explore interactively by clicking here.

Results time!

How much did I read

Joyful times! In 2017 I got to, and even exceeded, my target! 55 books read.

In comparison to my 2016 results, I got ahead right from the start of the year, and widened the gap notably in Q2. You can see a similar boost to that witnessed in 2016 around the time of the summer holidays, weeks 33-35ish. Not working is clearly good for one’s reading obligations.

What were the characteristics of the books I read?

Although page count is a pretty vague and manipulable measure – different books have different physical sizes, font sizes, spacing, editions – it is one of the few measures where data is easily available so we’ll go with that. In the case of eBooks or audio books (more on this later) without set “pages” I used the page count of the respective paper version. I fully acknowledge the rigour of this analysis as falling under “fun” rather than “science”.

So the first revelation is that this year’s average pages per read book was 300, a roughly 10% decrease from last year’s average book. Hmm. Obviously, if everything else remains the same,  the target of 50 books is easier to meet if you read shorter books! Size doesn’t always reflect complexity or any other influence around time to complete of course.

I hadn’t deliberately picked short books – in fact, being aware of this incentive I had tried to be conscious of avoiding doing this, and concentrate on reading what I wanted to read, not just what boosts the stats. However, even outside of this challenge, I (most likely?) only have a certain number of years to live, and hence do feel a natural bias towards selecting shorter books if everything else about them was to be perfectly equal. Why plough through 500 pages if you can get the same level of insight about a topic in 150?

The reassuring news is that, despite the shorter average length of book, I did read 20% more pages in total. This suggests I probably have upped the abstract “quantity” of reading, rather than just inflated the book count by picking short books. There was also a little less variation in page count between books this year than last by some measures.

In the distribution charts, you can see a spike of books at around 150 pages long this year which didn’t show up last year. I didn’t note a common theme in these books, but a relatively high proportion of them were audio books.

Although I am an avid podcast listener, I am not a huge fan of audio books as a rule. I love the idea as a method to acquire knowledge whilst doing endless chores or other semi-mindless activities. I would encourage anyone else with an interest in entering book contents into their brain to give them a whirl. But, for me, in practice I struggle to focus on them in any multi-tasking scenario, so end up hitting rewind a whole lot. And if I am in a situation where I can dedicate full concentration to informational intake, I’d rather use my eyes than my ears. For one, it’s so much faster, which is an important consideration when one has a book target! With all that, the fact that audio books are over-represented in the lower page-counts for me is perhaps not surprising. I know my limits.

I have heard tell that some people may consider audio books as invalid for the book challenge. In defence, I offer up that Goodreads doesn’t seem to feel this way in their blog post on the 2018 challenge. Besides, this isn’t the Olympics – at least no-one has sent me a gold medal yet – so everyone can make their own personal choice. For me, if it’s a method to get a book’s contents into my brain, I’ll happily take it. I just know I have to be very discriminating with regards to selecting audio books I can be sure I will be able to focus on. Even I would personally regard it as cheating to log a book that happened to be audio-streaming in the background when I was asleep. If you don’t know what the book was about, you can’t count it.

So, what did I read about?

What did I read

Book topics are not always easy to categorise. The categories I used here are mainly the same as last year, based entirely on my 2-second opinion rather than any comprehensive Dewey Decimal-like system. This means some sort of subjectivity was necessary. Is a book on political philosophy regarded as politics or philosophy? Rather than spend too much time fretting about classification, I just made a call one way or the other. Refer to above comment re fun vs science.

The main changes I noted were indeed a move away from pure philosophical entries towards those of a political tone. Likewise, a new category entrant was seen this year in “health”. I developed an interest in improving one’s mental well-being via mindfulness and meditation type subjects, which led me to read a couple of books on this, as well as sleep, which I have classified as health.

Despite me continuing to subjectively feel that I read the large majority of books in eBook form, I actually moved even further away from that being true this year. Slightly under half were in that form. That decrease has largely been taken up by the aforementioned audio books, of which I apparently read (listened to?) 10 this year. Similarly to last year, 2 of the audio entries were actually “Great Courses”, which are more like a sequence of university-style lectures, with an accompanying book containing notes and summaries.

My books have also been slightly less popular with the general Goodreads-rating audience this year, although not dramatically so.

Now, back to the subject of reading shorter books in order to make it easier to hit my target: the sheer sense of relief I felt when I finished book #50 and hence could go wild with relaxed, long and slow reading, made me concerned as to whether I had managed to beat that bias or not. I wondered whether as I got nearer to my target, the length of the books I selected might have risen, even though this was not my intention.

Below, the top chart shows the average page count by book completed on a monthly basis, year on year.

Book length over time


The 2016 data risks producing somewhat invalid conclusions, especially if interpreted without reference to the bottom “count of books” chart, mainly because of September 2016, a month where I read a single book that happened to be over 1,000 pages long.

I also hadn’t actually decided to participate in the book challenge at the start of 2016. I was logging my books, but just for fun (imagine that!). I don’t remember quite when it was suggested I should explicitly join the challenge, but before then it’s less likely I felt pressure to read faster or shorter.

Let’s look then only at 2017:

Book length over time 2

Sidenote: What happened in July?! I only read one book, and it wasn’t especially long. I can only assume Sally Scholz’s intro to feminism must have been particularly thought-provoking.

For reference, I hit book #50 in November this year. There does seem to be some suggestion in the data that I did indeed read longer books as time went on, despite my mental disavowal of doing such.

Stats geeks might like to know that the line of best fit shown in the top chart above could be argued to explain around 30% of the variation in book length over time, with each month cumulatively adding on an estimate of an extra 14 pages above a base of 211 pages. It should be stated that I didn’t spend too long considering the best model or fact-checking the relevant assumptions for this dataset. Instead, I just pressed “insert trend line” in Tableau and let it decide :).
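For anyone who wants to play with those numbers, here is a minimal sketch of the trend line as a function. The intercept (211) and slope (14) are the figures quoted above; the function name and the choice to count months from 0 are my own assumptions, since I haven’t checked exactly how Tableau indexes the months.

```python
# Coefficients quoted above for the 2017 trend line.
BASE_PAGES = 211        # intercept: the "base" page count
PAGES_PER_MONTH = 14    # slope: extra pages per month elapsed


def predicted_pages(months_elapsed: int) -> int:
    """Trend-line estimate of average book length.

    months_elapsed is a hypothetical index (0 = the baseline month);
    the real chart may number its months differently.
    """
    return BASE_PAGES + PAGES_PER_MONTH * months_elapsed


for month in (0, 6, 11):
    print(month, predicted_pages(month))
```

Bear in mind the fit only accounts for roughly 30% of the variation, so any single month can sit a long way from these estimates.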

I’m afraid the regression should not be considered as being traditionally statistically significant at the 0.05 level though, having a p-value of – wait for it – 0.06. Fortunately, for my intention to publish the above in Nature :), I think people are increasingly aware of the silliness of uncontextual hardline p-value criteria and/or publication bias.

Nonetheless, as I participate in the 2018 challenge – now at 52 books, properly one a week – I shall be conscious of this trend and redouble my efforts to keep reading based on quality rather than length. Of course, I remain very open – some might say hopeful! – that one sign of a quality author is that they can convey their material in a way that would be described as concise. You generous readers of my ramblings may detect some hypocrisy here.

For any really interested readers out there, you can once more see the full list of the books I read, plus links to the relevant Goodreads description pages, on the last tab of the interactive viz.

Characteristics of England’s secondary school teachers

In exploring the data behind England’s teacher supply model, it became apparent that the split of teachers by gender and age shows certain patterns by subject. Click through and use the below viz interactively to answer questions such as:

  • How many secondary school teachers are there in England?
  • What percentage of all teachers are female?
  • Are there certain subjects where females are over-represented in teachers vs others where males are over-represented? Have we overcome the historic gender stereotypes?
  • What proportion of teachers are below the age of 25? What subjects do they tend to teach?
  • What age-groups are particularly over-represented in females teaching art and design?
  • …and many more.

Use the first tab for a general overview and ranking of subjects on these indices; and the second tab to provide an easy comparison for your chosen subject vs all others.



How many teachers do we need? The official Governmental model

How do we know how many teachers are required to keep the UK’s schools in good working order? It’s an interesting question, with obvious implications for Governmental education policy with regards to teacher compensation, incentives, training places and so on.

The “official” requirements are calculated via the Government’s “Teacher Supply Model”, which, happily, in the name of transparency you can get a copy of here.

But rather than have to read through the 61 page user guide and two big fat Excel files, below are some basic notes on what factors go into its calculations. Most of this is summarised or reproduced from their manual (it is hefty, but I appreciate their openness in preparing it!).

Firstly we should define what exactly the model tries to calculate, and then predict.

Target variable

One of the key outputs is the number of teacher training places required, so it works from a top-down approach of “how many teachers do we want overall” to get to that figure.

The “how many teachers do we need to enter the profession each year?” is the focus of this post.

It’s referred to as “model part 1” in the official documentation. Model part 2 works from this to get to the actual number of NQT training places required, which is needed because there are other routes to increase teacher numbers outside of the typical NQT route.

Anyway, being such a model, part 1 necessarily involves a mix of data and assumptions.

It’s set up in a way that you can tweak the assumptions to show, for instance, the implications if more or fewer teachers quit than expected. Where figures from this model are mentioned in future posts, they come from the default “central” scenario unless noted otherwise.

The authors are (very) keen to highlight that any assumptions made in the model do not equate to, or suggest knowledge of, future Government policy! Rather, they are what these domain experts predict is likely to happen.

Model scope:

  • Only applies to England
  • Only concerns itself with qualified teachers
  • Includes state funded schools; primary (+ ones with nurseries attached), secondary, academies and free schools and key stage 5 (aka 6th-form) teaching within secondary schools.
  • Does not include special schools, referral units, independent schools, early years schools or standalone further/6th form colleges

Input variables used:

(*) wastage is the slightly unpleasant sounding term to mean teachers leaving for reasons other than death or retirement.

Assumptions implied:


The active stock of teachers in November 2014 (when the census is conducted) will not change significantly by the end of the 2014-15 academic year.

Teachers are categorised into what subject they actually teach, not what they were employed to teach. For example, if they are officially a science teacher, but spend 25% of the week teaching maths, then that’s 0.75 of a science teacher and 0.25 of a maths teacher.

Hours spent teaching PSHE are excluded.

Long term, the rate of change of key stage 5 pupil numbers will match the rate of change of the national 16-19 year old population. Short term, the same increases in post-16 participation from the past 3 years will continue.
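The subject-allocation assumption above (counting teachers by what they actually teach) is easy to sketch in code. This is only an illustration of the idea, not the model’s own implementation; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def allocate_fte(teachers):
    """Split each teacher's FTE across the subjects they actually teach.

    teachers: list of (fte, {subject: share_of_teaching_time}) pairs.
    The layout is a made-up one for illustration only.
    """
    totals = defaultdict(float)
    for fte, shares in teachers:
        for subject, share in shares.items():
            totals[subject] += fte * share
    return dict(totals)


# The example from the text: a full-time "science" teacher who spends
# 25% of the week teaching maths.
example = [(1.0, {"science": 0.75, "maths": 0.25})]
print(allocate_fte(example))  # {'science': 0.75, 'maths': 0.25}
```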


The proportion of teachers who will leave as wastage going forward (per age group, per gender) is calculated from a weighted average of the wastage rates in the past 4 years’ worth of data.

This data is also broken down into groups of subjects (but not individual subjects).

The groups are as follows:

  1. Group 1: EBacc Science and Mathematics subjects – including Biology, Chemistry, Computing, Mathematics, and Physics.
  2. Group 2: EBacc non-Science and Mathematics subjects – including Classics, English, Geography, History, and Modern Foreign Languages.
  3. Group 3: All other subjects – including drama, music, PE, and RE among others.

The below table shows the assumed wastage rates based on subject/age. In general, group 1 subject teachers are more likely to leave than group 2, and then group 3 are the least likely to leave.

Projected wastage rates also factor in economic variables via the “econometric wastage model”, e.g. looking at historical relationships between teacher wastage and economic growth, unemployment etc.
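To make the weighted-average step concrete, here is a minimal sketch. The rates and weights below are entirely hypothetical (I haven’t reproduced the model’s actual weights here), and the econometric adjustment is not included.

```python
def weighted_average(rates, weights):
    """Weighted average of historical rates (e.g. wastage rates)."""
    if len(rates) != len(weights):
        raise ValueError("rates and weights must be the same length")
    total_weight = sum(weights)
    return sum(r * w for r, w in zip(rates, weights)) / total_weight


# Hypothetical wastage rates for one age/gender group, oldest year first.
rates = [0.08, 0.09, 0.10, 0.11]
# Hypothetical weights that favour more recent years.
weights = [1, 2, 3, 4]

print(round(weighted_average(rates, weights), 4))
```

Weighting recent years more heavily is just one plausible choice for the illustration; the official user guide is the place to check what the model really does.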

Retirement / deaths

The model uses weighted historical retirement and deaths in service rates from the past 4 years of data. Rates are calculated by age group and by gender.

The model assumes retirement/deaths in service rates are the same in all subjects and over all future time periods (but it does take into account that some subjects tend to have older/different gender teachers than others, as well as the projected changes in teacher demographics).

Method to estimate future teacher stocks needed

Start by projecting how the pupil teacher ratio will change going forward as pupil numbers change.

This is not as simple as “if the number of pupils doubles, so should the number of teachers”. They show via data that historically, when the pupil population increased, some of the extra demand was absorbed by increasing the pupil teacher ratio, as well as by getting new teachers.

Therefore, the model assumes that for every 1% increase in pupil population from now, the pupil teacher ratio will increase by only 0.5 percentage points (primary) or 0.6 percentage points (secondary). It is, however, capped at a level that relates to previous historic maximums.

Here’s their “historical” chart on the subject:


Knowing future pupil numbers and future pupil-to-teacher ratios allows calculations of FTE teachers needed.
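As a rough sketch of the pupil teacher ratio step, the code below reads the 0.5 / 0.6 figures as a relative change, i.e. a 1% rise in pupils pushes the ratio up by 0.5% (primary) or 0.6% (secondary). That reading is my assumption, as are the function names, the cap value and the example numbers.

```python
def projected_ptr(base_ptr, pupil_growth_pct, elasticity, ptr_cap):
    """Project the pupil teacher ratio (PTR) after pupil numbers grow.

    elasticity: assumed % rise in PTR per 1% rise in pupils
                (0.5 for primary, 0.6 for secondary in this reading).
    ptr_cap:    hypothetical ceiling standing in for the historic maximum.
    """
    ptr = base_ptr * (1 + elasticity * pupil_growth_pct / 100)
    return min(ptr, ptr_cap)


def fte_teachers_needed(pupils, ptr):
    """FTE teachers implied by a pupil count and a pupil teacher ratio."""
    return pupils / ptr


# Hypothetical primary example: 1,000 pupils at a PTR of 20, then 10% growth.
base_pupils, base_ptr = 1000, 20.0
future_pupils = base_pupils * 1.10
future_ptr = projected_ptr(base_ptr, 10.0, elasticity=0.5, ptr_cap=25.0)

print(round(fte_teachers_needed(base_pupils, base_ptr), 1))
print(round(fte_teachers_needed(future_pupils, future_ptr), 1))
```

The point of the sketch is that teacher demand grows more slowly than pupil numbers, because part of the increase is absorbed by larger classes.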

The model assumes that the ratio of unqualified teachers to qualified teachers will remain constant at today’s rates (by phase and subject). Demand met by unqualified teachers is therefore removed from this model.

FTE teacher requirements are converted into actual physical headcount, via multiplying by the current FTE rate for teachers (with implicit assumption that that ratio remains constant going forward).

Then to calculate teacher need by subject:

Calculate FTE rate based on current needs; e.g. if 10% of teaching time in secondary schools is spent teaching English (irrespective of how many English teachers there are) then 10% of the workforce needs to be English teachers.

Different subjects are more or less popular as pupil options at the distinct Key Stages, and the proportion of pupils in each Key Stage also changes.

The model therefore estimates the quantity of teaching time needed per pupil per subject at KS3, 4, 5 and scales upwards.
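The share-of-teaching-time idea above can be sketched as follows. The subject hours and the total are made-up numbers for illustration, and the function name is hypothetical; the real model works per Key Stage and scales up from per-pupil teaching time.

```python
def teachers_by_subject(total_fte, hours_by_subject):
    """Split a total FTE requirement in proportion to teaching hours."""
    total_hours = sum(hours_by_subject.values())
    return {
        subject: total_fte * hours / total_hours
        for subject, hours in hours_by_subject.items()
    }


# Hypothetical weekly teaching hours: English takes 10% of the total.
hours = {"English": 10, "Maths": 12, "Everything else": 78}

print(teachers_by_subject(200, hours))
```

So if 10% of teaching time is English, 10% of the required workforce is allocated to English, whatever the current number of English teachers happens to be.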

Then, add adjustment for anticipated education policies:

If any changes of educational policy might be expected to adjust the need for teachers by more than 100 FTE then it should be added to the model. Right now there are 7 such policies in the Secondary teacher section.

The policies they address are as follows:

  • Hold 2016-17 ITT places for all Ebacc subjects at 2015/16 levels or above (to support “Ebacc for all” policy).
  • Assume increases in Ebacc subject takeup.
  • Remove option to just take Core Science (Core Science is to be replaced by Combined Science, meaning that 10% of KS4 students will need double the science teaching time).
  • Add extra maths teaching time for a new core mathematics policy.
  • Assume continuing increases of uptakes for Maths and Further Maths A-levels due to enhanced further mathematics support progress.
  • Impact of new Maths GCSE will require a greater amount of Mathematics teaching per pupil at KS3 & 4.
  • Impact of new English GCSE will require a greater amount of English teaching per pupil at KS4.

Estimating new teachers needed to enter stock each year

Once one knows the above figures, it’s a simple calculation:

Need for entrant teachers in year x = Teacher need in year x  - Stock of teachers at the end of previous year + No. of teachers expected to leave in year x
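In code, the formula above is a one-liner. The function name and example numbers are my own, purely for illustration.

```python
def entrants_needed(teacher_need, previous_stock, expected_leavers):
    """New entrants needed in year x, per the formula above."""
    return teacher_need - previous_stock + expected_leavers


# Hypothetical figures: need 1,000 teachers, ended last year with 980,
# and expect 90 to leave during the year.
print(entrants_needed(1000, 980, 90))  # 110
```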

This is calculated per subject per phase using variables on the left of the below diagram, iterating for years beyond 2016/17 with the process on the right of the diagram.


One key assumption is made here when it comes to modelling future years (2017-18 onwards).

The model assumes that if we determined we needed [x] new teachers by the end of year [y] then indeed that many will have been successfully acquired and added to the active stock. That will then be the starting stock for year [y+1].

There’s no consideration of events that would lead to fewer teachers being active in future years than the model says are required, e.g. if recruitment efforts fail.
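Here is a small sketch of what that assumption means when iterating over several years: each year’s starting stock is simply set to the previous year’s requirement. The single flat leaving rate, the function name and the numbers are all hypothetical simplifications (the real model uses rates by age, gender and subject group).

```python
def project_entrants(initial_stock, need_by_year, leaving_rate):
    """Entrants needed each year, assuming recruitment always succeeds."""
    entrants = []
    stock = initial_stock
    for need in need_by_year:
        leavers = stock * leaving_rate       # hypothetical flat rate
        entrants.append(need - stock + leavers)
        stock = need                         # key assumption: need is fully met
    return entrants


result = project_entrants(990, [1000, 1020, 1050], leaving_rate=0.10)
print([round(e) for e in result])  # [109, 120, 132]
```

If recruitment fell short in any year, the line that resets the stock to the requirement would be wrong, and the shortfall would carry forward into later years. That is exactly the scenario the model does not consider.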

That’s the end of part 1 of the model, which calculates the total and new teachers required.

Possibly more interesting will be to see the actual numbers behind the above calculations, which give indications of trends in KPIs affecting teacher requirements. More on that soon.