Solving for X (Factor)

[With the X Factor back on our screens, we’re delighted to present a guest post by Joy Leahy, statistics PhD student at Trinity College Dublin and part-time lecturer at Dublin City University and Institute of Technology Tallaght. Joy brings academic rigour to some common subjects for Sofabet discussion: the correlation between the public vote and factors such as the running order in live shows, the sympathy bounce, and the amount of screentime in auditions – Daniel.]

X Factor has had a huge influence on modern music since it began twelve years ago. Successful contestants have gone on to release 37 UK number ones, including seven Christmas number ones. It has given us superstars such as Leona Lewis and One Direction, but most importantly the show itself has given us many hours of entertainment. The tears, the drama, the sob stories, and the occasionally decent singing have succeeded in ruining Saturday nights for millions of boyfriends across the country! Of course, what keeps the nation gripped is that the public controls the fate of X Factor contestants… or do they?

In one of the most dramatic moments of the 2015 series, contestant Mason Noise got into a heated argument with show boss Simon Cowell, as Noise criticised the lack of airtime his audition received in comparison to other contestants. We were shown a 47-second clip of Noise. In comparison, fellow contestant Anton Stephans received 10 times as much airtime, during which we learned about his past career as a backing singer and even met his incredibly adorable dog Honey! In fact, by my stopwatch, Honey had approximately the same amount of airtime as Noise. But did this disadvantage Noise? The answer lies in a careful analysis of the show’s statistics.

The data

X Factor doesn’t release any voting data during the series. This adds to the suspense of who will finally be crowned the winner and perhaps also ensures that people will continue voting for their favourites, instead of those hovering near the bottom. From 2008, changes to OFCOM rules meant that X Factor UK released the exact percentage of the vote received by each contestant by week at the end of each series. This makes for a data set ripe for statistical analysis, as it allowed me to analyse the last eight years of voting data in depth. Contestant finishing order was also available on Wikipedia.

I had a fun number of days looking back over past X Factor episodes to find out how long each contestant’s first audition was. I excluded groups which were manufactured or altered on the show (One Direction, Little Mix etc) as these did not actually have a first audition and I didn’t want this to skew the data. I then calculated what percentage of the average airtime each contestant received within each series. For example, Mason had 14% of the average, while Anton had 138% of the average. On that basis alone I think I might have been annoyed too.
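The normalisation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the durations are made up for the example and are not the data used in the analysis.

```python
from statistics import mean

def percent_of_series_average(airtimes_by_contestant):
    """Map each contestant to their airtime as a percentage of the series mean."""
    series_mean = mean(airtimes_by_contestant.values())
    return {
        name: 100.0 * seconds / series_mean
        for name, seconds in airtimes_by_contestant.items()
    }

# Hypothetical first-audition durations in seconds for one series (not real data).
example_series = {"Act A": 60, "Act B": 300, "Act C": 540}
shares = percent_of_series_average(example_series)
print({name: round(pct) for name, pct in shares.items()})
```

Dividing by the mean within each series keeps the figures comparable across years, even if auditions were generally longer or shorter in a given series.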

[Figure 1: First audition airtime vs overall finishing position]

Lasting First Impressions

I first looked at whether we could use the first audition time to predict the position in which the contestant would finish. I used a general linear model with finishing position dependent on airtime and found airtime to be a significant explanatory variable. We can see this plotted in figure 1, where the trend line indicates that longer airtime in the initial segment does in fact predict a higher finishing position.

I next broke this down per week to see if we could predict the percentage vote per week based on the first audition time. As the percentage share of the vote is only available from series five, I restricted my data to series five to series twelve. Figure 2 shows the voting share in week one versus the proportion of first audition airtime for each contestant. We can clearly see a trend line showing that first audition airtime and week one vote share are positively correlated.

[Figure 2: First audition airtime vs vote share in week 1]

I ran a linear regression as follows:

Voting Percentage = β0 + First Audition Airtime

Here the percentage of the vote that each contestant receives is based on the airtime of the first audition and some other unknowns β0. We now need to figure out how much is based on each. Results of the regression for each week are shown in table 1.
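For readers who want to see what such a fit involves, here is a minimal least-squares sketch in Python. It is not the code used for the analysis. The regression above is written in shorthand without an explicit slope; the sketch names the intercept b0 and the airtime coefficient b1, and the data points are invented so that the answer is known in advance.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimising the squared error of y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up points lying exactly on y = 0.4 + 0.6 * x (illustrative only).
# Both variables are expressed as ratios of the mean, so 1.0 means "average".
airtime_ratio = [0.5, 1.0, 1.5, 2.0]
vote_ratio = [0.4 + 0.6 * a for a in airtime_ratio]
b0, b1 = fit_simple_ols(airtime_ratio, vote_ratio)
print(round(b0, 2), round(b1, 2))
```

With real data the points would scatter around the line, and the p-value reported for each week indicates how confident we can be that the slope differs from zero.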

        β0          Airtime
Week    Estimate    Estimate    p-value
1       0.39        0.62        0.00
2       0.73        0.28        0.12
3       0.70        0.32        0.04
4       0.63        0.38        0.04
5       0.92        0.08        0.70
6       0.99        0.00        1.00
7       0.55        0.41        0.06
8       0.96        0.06        0.80
9       0.92        0.08        0.77
10      0.88        0.12        0.64

Table 1: Regression Results: Airtime vs Vote Share

For week one, a contestant whose first audition is not shown is expected to get 39% of the mean vote. For each additional multiple of the average first audition airtime, they are expected to receive an extra 62% of the mean vote. Therefore, a contestant whose first audition airtime is twice as long as the average will be expected to receive 163% of the mean vote (39% + 2 × 62%). Note that for each week the β0 and airtime estimates should sum to approximately one, since if all contestants were equal we would expect them all to receive the mean vote. First audition airtime seems to be important for the first four weeks of the show, but perhaps on a decreasing scale week by week (although the week two estimate is not significant at the 5% level). After this time, it seems that the public are not as influenced by the initial impression, as it gets diluted by more recent performances. However, we can’t say for certain whether it still makes a difference in the later stages of the show, as the number of data points decreases each week.
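The arithmetic above can be checked with a short Python sketch. The coefficients are copied from table 1; the function and variable names are illustrative only.

```python
# Week -> (β0 estimate, airtime estimate), copied from table 1.
TABLE_1 = {
    1: (0.39, 0.62), 2: (0.73, 0.28), 3: (0.70, 0.32), 4: (0.63, 0.38),
    5: (0.92, 0.08), 6: (0.99, 0.00), 7: (0.55, 0.41), 8: (0.96, 0.06),
    9: (0.92, 0.08), 10: (0.88, 0.12),
}

def expected_vote_share(week, airtime_ratio):
    """Expected share of the mean vote; airtime_ratio = 1.0 is average airtime."""
    intercept, airtime_coef = TABLE_1[week]
    return intercept + airtime_coef * airtime_ratio

print(round(expected_vote_share(1, 2.0), 2))  # 1.63, i.e. 163% of the mean vote
print(round(expected_vote_share(1, 0.0), 2))  # 0.39, audition not shown

# Each pair of estimates sums to roughly one, as noted in the text.
sums = {week: b0 + b1 for week, (b0, b1) in TABLE_1.items()}
print(all(abs(total - 1.0) <= 0.05 for total in sums.values()))  # True
```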

Of course, it is highly likely that the producers will have a good idea of which contestants will be popular and therefore allocate more airtime to contestants who would have attracted a higher percentage of the vote even with equal first auditions. However, it does appear that the early rounds of voting are impacted by first audition. This can be seen by anecdotal evidence for Only the Young, Olly Murs, JLS and Fleur East who all had less than average airtime for their first audition, but whose popularity grew as the competition went on, presumably due to strong performances or likeable personalities. It seems plausible that even if a contestant is quite weak, having a long first audition could build up enough support to see the act through the early weeks while viewers are still trying to distinguish between the better singers who have still not shown their personality.

A Hard Act to…Precede

However, there must be other things the producers can do to influence voting. After all, not everyone watches or even remembers the earlier rounds of the competition. Page and Page analysed rankings from both Pop Idol and X Factor in various countries from 2002-2007. They showed that the running order of the show influences the voting, with contestants performing later in the show more likely to rank higher in the voting. However, they did not have the benefit of the exact voting percentages we have today.

Firstly, I analysed the impact of position by looking at the exact voting numbers. I grouped all weeks which had each contestant singing once. As there was a differing number of contestants in each show I used percentage position in the show instead of absolute position. The result for each contestant was calculated as follows:

(percentage of average vote received in current week)/(average of percentage of average vote received in all previous weeks) × 100%
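As a rough sketch of these two quantities in Python: the function names are hypothetical, the vote figures are invented, and the position scaling (slot divided by number of acts, so that the last act scores 1.0) is an assumption about how percentage position was defined.

```python
def percent_position(slot, n_acts):
    """Running-order slot (1 = first) as a fraction of the show, so last = 1.0."""
    return slot / n_acts

def relative_result(current_pct, previous_pcts):
    """This week's % of the average vote relative to the mean of earlier weeks."""
    baseline = sum(previous_pcts) / len(previous_pcts)
    return current_pct / baseline * 100.0

# Hypothetical act: 80% and 120% of the average vote in earlier weeks,
# then 150% this week, performing sixth out of twelve.
print(relative_result(150.0, [80.0, 120.0]))  # 150.0
print(percent_position(6, 12))                # 0.5
```

A result above 100% means the act did better this week than their own track record would suggest, which is what lets running order be separated from underlying popularity.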

My results agreed with Page and Page that later contestants are more likely to get a higher percentage of the vote.

A plot of result vs position is shown in figure 3.

[Figure 3: Result vs position]

An interesting sidenote is the outlier coloured in red. This is Rachel Adedeji in week three. Adedeji received a measly average of 34% of the mean vote in weeks one and two but received 154% of the mean vote in week three. So what is happening here? Perhaps this could be explained by a so-called voting “bounce”. While the public are not aware of the exact voting breakdown, they do know who was in the bottom two sing-off each week. The theory is that the saved act will receive a higher percentage of the vote the following week as the public know that they are in need of votes to stay in the competition. While this is an extreme example of the “bounce”, I think this also needs to be included in the model.

I also wanted to investigate if strategic placing of contestants relative to each other impacts voting. I looked at the average amount of previous votes that the contestant immediately before (named “Before”) and after (“After”) received. I ran another linear regression with position, Before and After in the model, as well as the “bounce”.

Result vs Expected = β0 + Position + Before + After + Bottom 2

Here we think that the voting percentage will be based on your position, the strength of the contestant before you and the strength of the contestant after you, whether you were in the bottom two the previous week, as well as other unspecified factors known as β0.

You can compare regression models using the Akaike Information Criterion (AIC). This measures the goodness of fit of each model while penalising overly complex models. The statistical package R has a useful stepAIC function (in the MASS package) to help identify the best model. This concluded that we should keep the terms “Position”, “After” and “Bottom2” in our model. This means that it doesn’t really matter who performs before an act, but a good contestant performing immediately afterwards can really hurt the preceding contestant’s chances.
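To give a feel for how AIC trades fit against complexity, here is a small Python sketch using a common form of AIC for least-squares models. The residual sums of squares and the observation count are invented purely for illustration; they are not taken from the analysis.

```python
import math

def gaussian_aic(n_obs, rss, n_params):
    """AIC for a least-squares fit, up to a constant shared by all models."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

# Hypothetical fits to the same 100 observations (values invented).
full = gaussian_aic(100, rss=50.0, n_params=5)     # model including "Before"
reduced = gaussian_aic(100, rss=50.2, n_params=4)  # model without "Before"
print(reduced < full)  # True: the tiny gain in fit doesn't justify the extra term
```

Lower AIC is better, so in this made-up case the extra term would be dropped. A stepwise procedure repeats this kind of comparison, adding or removing one term at a time until no change improves the AIC.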

Result vs Expected = β0 + Position + After + Bottom 2

The results are shown in table 2.

Term        Estimate    p-value
β0          0.72        0
Position    0.49        0
After       -0.07       0.01
Bottom2     0.40        0

Table 2: Regression Results: Expected results

Before accounting for other factors in the model, a contestant performing last will be expected to receive 121% (72%+49%) of their otherwise expected vote. Being before a contestant who got the mean vote will lose you 7% of your otherwise expected vote. If you were in the bottom two the previous week you should expect an increase of 40% of your otherwise expected vote.

It would have been interesting to include initial airtime in this model as well. However, as I was comparing voting numbers from previous weeks it would not have made sense to include it as initial airtime would have been accounted for within each week of voting.

There are a number of other factors under the producers’ control that could also influence the voting. Song choice, emotional stories leaked to the press, judges’ comments and overall airtime are all examples of factors that could be worth analysing. Even without these, this analysis indicates that the producers have the ability to influence voting.

The Hidden Factor

So can we identify any contestants who were sabotaged? Well, it’s hard to say for certain, especially in the first few weeks when we have very few data points for each contestant. However, out of all the contestants in my analysis who made it onto the live tour (i.e. final seven or eight), Mason Noise came out worst in the running order (position and contestant after him). In addition, he had to sing Justin Bieber’s “Sorry” to the audience on the first live show and every night on the X Factor tour. So what can we learn from this? It seems it’s quite bad to have a short first audition airtime, but given the potential tricks up the sleeves of the producers, it is even worse to make an enemy of Simon Cowell!

References

Page L, Page K. Last shall be first: A field study of biases in sequential performance evaluation on the Idol series. Journal of Economic Behavior & Organization. 2010 Feb 28;73(2):186-98.

15 comments to Solving for X (Factor)

  • Jessica Hamby

    I thought Emily was great. An object lesson in how to get an audience to fall in love with you. A performance brimming with subtlety and nuance, patient and measured. It was probably a good thing Simon stopped her when he did. You don’t want to show everything you’ve got at your first audition if you don’t have to (obv that assumes that she has more in the locker – I hope so).

  • Martin

    Interesting and timely article! It’s nice to have confirmation of information we speculate about, as well as anomalies and exceptions to the rule.

    In terms of this weekend, a few thoughts:

    – RE: spending pretty much an entire weekend in Scotland for auditions. Is the train of thought here ‘fine we’ll hold auditions in Scotland but we’ll split their vote across four contestants who all seem marketable to us, that’ll show them’?

    – Yes Lad seem a viable (if a bit more District 3 than 1D) boyband, surprised to see them opening the first show and not through. Every act we see with considerable audition time is going to be subject to ‘are they the wild card?!?’ speculation but they’re on Xtra Factor again tonight and JH is already done with. Emily Middlemas and James from the boys are also on tonight.

    – in the same vein, surprised to see two pimp slots going to two acts who don’t go through to judges’ houses. I remember Rebekah Ryan from when she had a record deal and that may not be a particularly strong story but she has the single parent angle and the tragic story of her losing her son. I can’t see them wasting such a powerful back story, and a Tamworth wildcard has happened before…

    It’s been a slow weekend – nobody has blown me away apart from Nicole Scherzinger. Xtra Factor is also strangely self aware and very very funny, if lacking any betting relevance this year.

    • Jessica Hamby

      If we assume the franchise comes first it makes sense and after last year they have a bit of work to do.

      Fwiw I thought the last act, as well as being totes emoshunal, was kindly treated. Maybe she got something good from that.

    • 360

      Rebekah Ryan – which I don’t think they mentioned – has also been on XF before – in 2007 she made the top 12 in the Overs according to wikipedia. Makes you wonder if they’ve offered her a deal to get to lives in order to come back this year – with her tragic backstory that would seem likely. Into lives and out week 1 but with promo in her back pocket, is one route I can see them taking with her.

  • Rose Lloyd

    Emily Middlemas this year reminded me a bit of Abi Alton, with the judges recognising her as an artist with a pre-formed style. She is a bit less fragile but there’s something that reminds me of Abi in her performance. I do hope this is as far as that comparison goes!

    • Martin

      There were shades of Abi in Emily’s audition, most notably the white guitar! I think Emily’s voice is stronger than Abi’s ever was, and her overall act seems less niche than the usual alternative girl we have. She seems to lie somewhere between Abi Alton and Ella Henderson. As Jessica says above, she came across very well but I’m doubtful that we’ve seen this year’s winner so far.

  • Cath

    Joy – great article! Thanks for sharing your analysis. It’s very interesting to see how the accepted wisdom stacks up against the stats.

    As for the shows this weekend, same old same old, I’ve seen nothing that suggests they’ll be able to revive the show and its ratings this year. Seen it all, sob stories and all, many times before. As usual, it will just be Sofabet that keeps me watching!

  • Boki

    Thanks for the interesting article, always nice to look into some stats with a “scientific approach”.
    Now I have another question in relation to 1st audition duration: how important are Youtube audition clips on the official X-factor channel?
    These clips are mostly edited and only some of them show the full video from the audition episode. Some of them have millions of views so I would say they have a big influence, but can the difference between the aired and YT duration give us a clue for the intended trajectories?
    Here is the comparison for last weekend’s shows, airtime vs YT (only the important ones, roughly rounded):

    yes lad 6:00 2:45
    jamesW 6:45 3:30
    christian 8:30 5:45
    caitlyn 6:00 2:30
    honey g 6:15 5:45

    eddie 4:30 3:00
    saara 6:30 6:30
    jamesH 4:00 1:45
    emily 5:00 1:45
    rebekah 9:00 9:00

    • 360

      Interesting stats, given that from what spoilers seem to show so far, of the longest YT clips there, they offload ALL of Saara, Rebekah and Honey G (why? she’s literally a ready made joke act) before Judges’ Houses.

      That might suggest that long clips aren’t as much of an advantage. I would guess perhaps that the longer clips are there to advertise and get viewers invested in the show itself and the judges, while the shorter clips exist for us to focus on the contestants themselves and get invested in them. Isn’t a selling point of YouTube that you can quickly consume the content you want to, without all the distractions and padding actually watching the show would have?

      • Boki

        It looks like the pimp slot clips are longer than the rest but in order to have a proper analysis someone would have to go through the previous years and compare…

        Btw, Joy (article author) says “I had a fun number of days looking back over past X Factor episodes to find out how long each contestant’s first audition was.” Does it mean he has the numbers from each episode (if yes publish them please 🙂 ) or his “source material” was YT itself?

    • Joy

      Thanks Boki for pointing that out. I actually used the YouTube clips and just excluded any preamble at the start. If there were room and stage auditions I added the two together. It’s definitely worth looking at the differences between the two though!

  • Piresistable

    Perhaps we need some regression analysis on the effect of red and black staging?!

    Seriously, though, very interesting. It’s worth noting that the voting system has varied through the years in terms of when the phone lines are open (I can’t remember which years are which, though). This might make a difference. The other thing to chuck into the mix would be advert breaks. That particularly helps with memory holing candidates preceding big names.

    • Joy

      I think it would be really interesting to look at the ad breaks! Definitely a lot that they can play around with there. If anyone has the data on those please let me know!
