Prediction Analysis 2022

The Hottest 100 of 2022 was a very interesting countdown for 100 Warm Tunas. This year, we once again got to leverage the fact that Triple J played the Hottest 200 the day before the Hottest 100, allowing us to eliminate from our Hottest 100 prediction any songs that had already ranked in the Hottest 200.

In addition, this year we introduced two new features to 100 Warm Tunas. The first was website submission, which let site visitors upload their votes directly to 100warmtunas.com and significantly grew the size of our sample. The second was an ML model that tries to adjust the prediction to account for bias in the collected sample.

Summary

  • We collected 4,623 entries (62% increase since 2021 🔺)
  • We tallied 41,731 votes across these entries (61.3% increase since 2021 🔺)
  • triple j counted 2,436,565 votes.
  • Therefore, we collected a sample of 1.71%.
  • We successfully predicted #1
  • We predicted 7 out of the top 10 songs.
  • We predicted 14 out of the top 20 songs.
  • We predicted 84 out of the top 100 songs played in the countdown.
  • Throughout December and January, 100warmtunas.com was loaded over 228,000 times by 65,000 users.

Technical Analysis

Overview

This year, we were once again able to successfully predict #1, with the assistance of an ML model applied to the data we had collected. Let’s take a look at the top 10 of the official countdown and match it up with the predicted places in 100 Warm Tunas:

Artist | Title | ABC Rank | Tunas Rank | Difference
Flume | Say Nothing [Ft. MAY-A] | 1 | 1 | 0
Eliza Rose & Interplanetary Criminal | B.O.T.A. (Baddest Of Them All) | 2 | 7 | 5
Spacey Jane | Hardlight | 3 | 4 | 1
Steve Lacy | Bad Habit | 4 | 10 | 6
Spacey Jane | It’s Been A Long Day | 5 | 18 | 13
Spacey Jane | Sitting Up | 6 | 8 | 2
Lizzo | About Damn Time | 7 | 21 | 14
Ball Park Music | Stars In My Eyes | 8 | 3 | 5
Gang of Youths | in the wake of your leave | 9 | 2 | 7
Joji | Glimpse of Us | 10 | 12 | 2

Let’s pull apart this table a bit more and grab some statistics about how we did with our prediction overall:

Predicted Out Of Top N Percentage
7 10 70.0%
14 20 70.0%
22 30 73.3%
29 (*27) 40 72.5%
39 (*35) 50 78.0%
47 (*42) 60 78.3%
56 (*49) 70 80.0%
64 (*56) 80 80.0%
76 (*65) 90 84.4%
84 (*70) 100 84.0%
156 200 78.0%

From the above data, we can see that:

  • The mean error across the top 10 was 5.5 positions (median 5), higher than in 2021 and 2020, when it was about 3 positions.
  • Warm Tunas predicted 7 out of the top 10 songs.
  • Warm Tunas predicted 14 out of the top 20 songs.
  • Warm Tunas predicted 84 out of the 100 songs played in the countdown.

(note, asterisked values represent prediction without top 200 elimination)
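
The effect of top 200 elimination can be sketched in a few lines. This is a minimal illustration, not the code behind 100 Warm Tunas: the function name `eliminate_top_200` and the song names are hypothetical, and it assumes the prediction is an ordered list of titles and that the songs played in the Hottest 200 are known as a set.

```python
def eliminate_top_200(predicted, played_in_200):
    """Drop songs already played in the Hottest 200 (positions 101-200).

    Surviving songs keep their relative order, so everything ranked
    below a removed song moves up one place.
    """
    return [song for song in predicted if song not in played_in_200]


# Hypothetical example data, for illustration only.
predicted = ["Song A", "Song B", "Song C", "Song D", "Song E"]
played_in_200 = {"Song B", "Song D"}

adjusted = eliminate_top_200(predicted, played_in_200)
print(adjusted)  # ['Song A', 'Song C', 'Song E']
```

Because removed songs free up places, songs that would otherwise sit just outside a bucket can move into it, which is why the asterisked counts are lower.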

Making exact predictions is difficult, especially when you only have a small fraction of the data and that fraction carries selection bias. The data we collect generally gives a good indication, but will never allow us to predict with certainty that a particular outcome will occur.

The following tables depict positional accuracy and error breakdown exhibited across different cross-sections of the countdown:

Positional Accuracy of the Top 10:

Positions Off Num Occurrences % Positions Off (Cumulative) Occurrences Cumulative %
0-4 4 40% 0-4 4 40%
5-9 4 40% 0-9 8 80%
10-14 2 20% 0-14 10 100%
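
As a rough sketch of how such bands can be derived, the snippet below groups the “Difference” column of the top 10 table into bands of five positions and accumulates them. The band width and variable names are chosen here for illustration and are not necessarily how the site computes them.

```python
from collections import Counter

# "Difference" column of the top 10 table in the Overview.
differences = [0, 5, 1, 6, 13, 2, 14, 5, 7, 2]
BAND_WIDTH = 5  # bands of 0-4, 5-9, 10-14, ...

# Map each error to its band index and count occurrences per band.
band_counts = Counter(d // BAND_WIDTH for d in differences)

cumulative = 0
rows = []
for band in range(max(band_counts) + 1):
    count = band_counts.get(band, 0)
    cumulative += count
    low = band * BAND_WIDTH
    high = low + BAND_WIDTH - 1
    rows.append((f"{low}-{high}", count, cumulative / len(differences)))

for label, count, cumulative_share in rows:
    print(f"{label:>6} {count:>2} {cumulative_share:.0%}")
```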

Positional Accuracy of the Top 20:

Across the top 20, 70% of our predictions were accurate within 9 places of their actual position in the countdown.

Positions Off Num Occurrences % Positions Off (Cumulative) Occurrences Cumulative %
0-9 14 70% 0-9 14 70%
10-19 4 20% 0-19 18 90%
40-49 1 5% 0-49 19 95%
> 80 1 5% > 0 20 100%

Positional Accuracy of the Top 50:

The trend continues across the top 50, with 50% of our predictions accurate within 9 places of their actual position in the countdown and 72% accurate within 19 places. This shows that 100 Warm Tunas is still reasonably good at predicting which songs will play during the countdown, but not necessarily their exact order.

Positions Off Num Occurrences % Positions Off (Cumulative) Occurrences Cumulative %
0-9 25 50% 0-9 25 50%
10-19 11 22% 0-19 36 72%
20-29 2 4% 0-29 38 76%
30-39 2 4% 0-39 40 80%
40-49 4 8% 0-49 44 88%
50-59 1 2% 0-59 45 90%
60-69 0 0% 0-69 45 90%
70-79 1 2% 0-79 46 92%
> 80 2 4% > 0 48 96%
Did not rank 2 4% - - -

Positional Accuracy of the Top 100:

For completeness, the accuracy bands of the top 100 have also been provided:

Positions Off Num Occurrences % Positions Off (Cumulative) Occurrences Cumulative %
0-9 32 32% 0-9 32 32%
10-19 18 18% 0-19 50 50%
20-29 10 10% 0-29 60 60%
30-39 7 7% 0-39 67 67%
40-49 11 11% 0-49 78 78%
50-59 3 3% 0-59 81 81%
60-69 2 2% 0-69 83 83%
70-79 2 2% 0-79 85 85%
> 80 5 5% > 0 90 90%
Did not rank 10 10% - - -

Accuracy Deep Dive: Is accuracy decreasing over time?

A comment often made about 100 Warm Tunas (especially on Hottest 100 Day) is that the predictions are getting worse and are “way off”. This criticism is not unique to 2022’s countdown predictions. To address this point up front: 100 Warm Tunas never was, and never will be, about predicting exact positions. Whilst our website does display an ordering of songs and their predicted positions from 1-100 (and 101-200), this is provided as an indication only. From the tables in the previous section, and from all past analysis, we know there is inherent error in predicting exact positions.

That said, we believe it’s important to analyse 100 Warm Tunas’ accuracy over the last 6 years. To do this, we need a way to measure “accuracy”. For this analysis, we want to compare both:

  • Positional Accuracy (that is, how closely 100 Warm Tunas can predict exact positions)
  • Bucketed Accuracy (that is, how many songs 100 Warm Tunas can predict in specific buckets, ignoring order, e.g. 7 of top 10, 82 of top 100)

Positional Accuracy

To measure positional accuracy of 100 Warm Tunas’ predictions, we will take the mean and median of the positional error for every year of data. To provide a more interesting analysis, we will make this calculation for the Top 10, Top 20, Top 50, and Top 100. To provide a fair comparison across all years, top 200 elimination is not used.

Mean Error, Without Top 200 Elimination:

Year ϵ Top 10 ϵ Top 20 ϵ Top 50 ϵ Top 100
2022 5.5 15.3 40.2 49.9
2021 3.1 27 50.3 58.5
2020 3 11.2 22.1 33.3
2019 2.9 10.4 21.0 31.6
2018 7 12.5 15.5 27.4
2017 2.1 4.8 10.1 18.1

Median Error, Without Top 200 Elimination:

Year ϵ Top 10 ϵ Top 20 ϵ Top 50 ϵ Top 100
2022 5.0 6.5 9.5 19.5
2021 3.5 4.0 11.5 22.0
2020 2.0 3.5 19.0 26.0
2019 2.5 5.0 11.5 17.0
2018 3.5 4.5 10.0 16.5
2017 2.0 3.0 6.0 11.0
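
As a minimal sketch of this measurement, the snippet below takes the ABC and Tunas ranks from the top 10 table in the Overview and computes the mean and median absolute positional error. The results match the 2022 Top 10 figures in the tables above. How songs that did not rank at all are scored is not covered here, so treat this as an illustration of the metric rather than the exact pipeline.

```python
from statistics import mean, median

# (ABC rank, Tunas rank) pairs from the top 10 table in the Overview.
top_10 = [(1, 1), (2, 7), (3, 4), (4, 10), (5, 18),
          (6, 8), (7, 21), (8, 3), (9, 2), (10, 12)]

# Positional error: absolute distance between actual and predicted place.
errors = [abs(actual - predicted) for actual, predicted in top_10]

print(mean(errors))    # 5.5
print(median(errors))  # 5.0
```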

From the above data and visualisation, we can deduce that there is an upward trend in the error of the Top 10 and Top 20. However, compared to 2020 and 2021, the median error in the Top 50 and Top 100 actually decreased in 2022.

Bucketed Accuracy

To measure bucketed accuracy, we will count the number of songs predicted in the correct bucket. For the sake of simplicity, we will only use the following buckets: 1-10, 1-20, 1-50, 1-100.

Year #1-1 # 1-10 # 1-20 # 1-50 # 1-100
2022 1/1 7/10 14/20 39/50 (*35/50) 84/100 (*70/100)
2021 1/1 8/10 14/20 35/50 (*34/50) 82/100 (*73/100)
2020 1/1 8/10 12/20 33/50 75/100
2019 0/1 8/10 14/20 38/50 73/100
2018 1/1 7/10 15/20 37/50 83/100
2017 1/1 8/10 16/20 42/50 83/100

(note, asterisked values represent prediction without top 200 elimination)
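
Bucketed accuracy can be sketched as a simple count. The helper `bucket_hits` below is a hypothetical name, and using `None` for a song that did not rank in the prediction is an assumption made for illustration. The data is the top 10 table from the Overview, so only the 1-10 bucket can be reproduced here.

```python
def bucket_hits(ranks, n):
    """Count songs in the actual top n that were also predicted in the top n.

    `ranks` holds (actual_rank, predicted_rank) pairs; a predicted rank of
    None means the song did not appear in the prediction (an assumption).
    """
    return sum(
        1
        for actual, predicted in ranks
        if actual <= n and predicted is not None and predicted <= n
    )


# (ABC rank, Tunas rank) pairs from the top 10 table in the Overview.
top_10 = [(1, 1), (2, 7), (3, 4), (4, 10), (5, 18),
          (6, 8), (7, 21), (8, 3), (9, 2), (10, 12)]

print(bucket_hits(top_10, 10))  # 7
```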

From this data, we can deduce that the general bucketed accuracy of 100 Warm Tunas has not drastically changed since 2019. Even without Top 200 elimination, we can still achieve 35/50 and 70/100, which is generally on-trend with the last 4 years.

We can also visualise these trends over time with a stacked bar chart (showing results without top 200 elimination):

Data Volume over time:

Finally, a “data over time” analysis wouldn’t be complete without looking at the volume of data we collect and comparing it to the number of votes counted by Triple J.

Year Tunas Votes ABC Votes Sample Size
2022 41,731 2,436,565 1.71%
2021 25,877 2,500,409 1.03%
2020 36,156 2,790,224 1.30%
2019 45,112 3,211,596 1.40%
2018 58,463 2,758,584 2.12%
2017 67,085 2,386,133 2.81%
2016 65,412 2,250,000 ~2.91%

One thing that initially stands out is the fact that Triple J’s number of votes counted has been decreasing since 2019. Is the Hottest 100 becoming less popular these days?

As for 100 Warm Tunas, the volume of data collected had been trending downward. Fortunately, this year we bucked that trend and collected 61% more votes than in 2021, giving a sample proportion (1.71%) larger than in 2021, 2020, and 2019.
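
For reference, the sample sizes and the year-on-year growth follow directly from the table above. The snippet is a small sketch using the 2019 to 2022 rows; the variable names are ours.

```python
# Vote counts for 2019-2022, taken from the table above.
tunas_votes = {2022: 41_731, 2021: 25_877, 2020: 36_156, 2019: 45_112}
abc_votes = {2022: 2_436_565, 2021: 2_500_409, 2020: 2_790_224, 2019: 3_211_596}

# Sample size: share of Triple J's counted votes that we collected, in percent.
sample_pct = {year: 100 * tunas_votes[year] / abc_votes[year] for year in tunas_votes}

# Growth in collected votes from 2021 to 2022, in percent.
growth_pct = 100 * (tunas_votes[2022] / tunas_votes[2021] - 1)

for year in sorted(sample_pct, reverse=True):
    print(year, f"{sample_pct[year]:.2f}%")
print(f"Growth since 2021: {growth_pct:.1f}%")
```

Note that 2019 had more raw votes than 2022, but a smaller share of Triple J's total, which is why 2022 still has the larger sample.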

ML Performance Analysis

This year, we introduced an ML model that adjusts vote counts, and thereby reorders the prediction, based on historical trends learned from the data of all our previous predictions.

In this section we will analyse the performance of the model and determine whether it improved the outcome and accuracy of the prediction.

Artist | Title | ABC Rank | Tunas Rank (no ML) | Tunas Rank (ML)
Flume | Say Nothing [Ft. MAY-A] | 1 | 3 | 1
Eliza Rose | B.O.T.A. (Baddest Of Them All) | 2 | 9 | 7
Spacey Jane | Hardlight | 3 | 4 | 4
Steve Lacy | Bad Habit | 4 | 14 | 10
Spacey Jane | It’s Been A Long Day | 5 | 16 | 18
Spacey Jane | Sitting Up | 6 | 6 | 8
Lizzo | About Damn Time | 7 | 20 | 21
Ball Park Music | Stars In My Eyes | 8 | 2 | 3
Gang of Youths | in the wake of your leave | 9 | 1 | 2
Joji | Glimpse of Us | 10 | 17 | 12

When comparing ML vs no ML, we initially see that:

  • When ML is not used, 100 Warm Tunas does not successfully predict #1.
  • When ML is not used, 100 Warm Tunas only predicts 6 of the top 10 (vs 7 with ML).
  • When ML is used, some Spacey Jane songs are incorrectly down-ranked (Sitting Up 6 → 8). The same applies to Lizzo’s ‘About Damn Time’ (20 → 21).
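
To see the per-song effect more systematically, the sketch below takes the ranks from the table above and checks, for each song, whether the ML-adjusted rank ended up closer to or further from the ABC rank. The grouping into improved and worsened is our own framing for illustration.

```python
# (title, ABC rank, Tunas rank without ML, Tunas rank with ML), from the table above.
rows = [
    ("Say Nothing", 1, 3, 1),
    ("B.O.T.A. (Baddest Of Them All)", 2, 9, 7),
    ("Hardlight", 3, 4, 4),
    ("Bad Habit", 4, 14, 10),
    ("It's Been A Long Day", 5, 16, 18),
    ("Sitting Up", 6, 6, 8),
    ("About Damn Time", 7, 20, 21),
    ("Stars In My Eyes", 8, 2, 3),
    ("in the wake of your leave", 9, 1, 2),
    ("Glimpse of Us", 10, 17, 12),
]

# A song "improved" if its ML rank is closer to the ABC rank than its raw rank.
improved = [t for t, abc, no_ml, ml in rows if abs(abc - ml) < abs(abc - no_ml)]
worsened = [t for t, abc, no_ml, ml in rows if abs(abc - ml) > abs(abc - no_ml)]

print(len(improved), "improved:", improved)
print(len(worsened), "worsened:", worsened)
```

On this data, six of the top 10 songs move closer to their actual position with ML, three move further away, and one (Hardlight) is unchanged.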

Positional Accuracy

Mean Error

ML/No ML ϵ Top 10 ϵ Top 20 ϵ Top 50 ϵ Top 100
No ML 6.5 12.8 31.7 34.1
ML 5.5 12.2 31.1 33.8

Median Error

ML/No ML ϵ Top 10 ϵ Top 20 ϵ Top 50 ϵ Top 100
No ML 7.0 7.5 8.5 16.5
ML 5.0 6.0 8.0 15.5

From this data, we can deduce that ML brings a general improvement to the prediction, since the error for every bucket has been reduced when ML adjustment is used.

Bucketed Accuracy

No ML ML Out Of Top N
6 7 10
15 14 20
22 22 30
29 29 40
39 39 50
84 84 100

As we have already seen, using ML allows us to predict 7 of the top 10. Interestingly, using ML decreases the bucketed accuracy of the top 20, reducing it by 1, to 14/20. The remainder of the buckets maintain the same outcome, and this is expected, as we only apply the model to the top 25 predictions.

Vote Sources

100 Warm Tunas collects votes from a handful of different sources. This year we introduced the ability for users to upload their votes directly to the website. Here is the break-down of how many votes were counted across the different sources:

Source Num Entries Num Votes Votes Per Entry
Instagram DM 321 2983 9.2928
Instagram Story 240 2009 8.3708
Instagram Feed 871 7780 8.9323
Reddit 226 1999 8.8451
Twitter 249 2121 8.5181
Website Upload 3143 28741 9.1444
Total 5050 45633 9.0362
Total (after de-duplication) 4623 41731 9.0268

This year we collected a majority of our data through direct upload from our website.

A question that we’d like to ask ourselves is “Which is the most accurate source, or combination of sources?”. We can determine this by doing both a positional and bucketed accuracy analysis on every combination of vote sources.
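
Enumerating every combination of sources is straightforward. The sketch below only lists the combinations; scoring each one would require re-running the positional and bucketed analysis on the votes from just those sources, which is not shown here.

```python
from itertools import combinations

# The six vote sources listed in the table above.
sources = [
    "Instagram DM",
    "Instagram Story",
    "Instagram Feed",
    "Reddit",
    "Twitter",
    "Website Upload",
]

# Every non-empty subset of sources, from single sources up to all six.
source_sets = [
    combo
    for size in range(1, len(sources) + 1)
    for combo in combinations(sources, size)
]

print(len(source_sets))  # 63
```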

Accuracy by Source

From the data, we can deduce the following:

  • The best top 10 source is “IG DM”, with a median error of 3.5 (Baseline is 5). Using just this source also predicts 3 of the top 5.
  • The worst top 10 and top 100 source is “Reddit Comment”, predicting only 59 of the top 100 (Baseline is 84)

Vote Submissions

It comes as no surprise that the most popular day for people to share their votes was the day voting closed, with 287 entries collected:

We can further break down the data by source and apply a cumulative sum to visualise it in a “race” format:
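
The cumulative sum behind a “race” view like this can be sketched with `itertools.accumulate`. The daily counts below are made-up placeholder numbers, not our real data, and the dictionary layout is an assumption for illustration.

```python
from itertools import accumulate

# Hypothetical daily entry counts per source, for illustration only.
daily_entries = {
    "Website Upload": [5, 12, 30],
    "Reddit": [2, 0, 4],
}

# Running total per source: each day's value is the sum of all days so far.
cumulative_entries = {
    source: list(accumulate(counts)) for source, counts in daily_entries.items()
}

print(cumulative_entries["Website Upload"])  # [5, 17, 47]
print(cumulative_entries["Reddit"])          # [2, 2, 6]
```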

Finally, we can then break down the votes collected for each song over time:

Vote Counts Per Day (with ML adjustments)

Vote Counts Per Day (without ML, raw vote counts)

Wrap Up

This year was an absolute blast - we got to experiment with new technology (ML Model), and successfully predicted #1.

We’d like to thank everyone who shared their votes, visited our site or interacted with us on Instagram. Without our audience, 100 Warm Tunas wouldn’t be possible. See you next year!

If you enjoyed this analysis and would like to show your appreciation for the work that I do, you can buy me a coffee ☕️.