How Random is the JetPunk Randomizer?


Introduction

In March 2019, Stewart uploaded the Random Countries on the World Map quiz. In this quiz, 20 random countries on the JetPunk world map are highlighted. As of January 2022, this quiz has almost 500,000 takes. Nevertheless, complaints have been voiced in the comment section that the JetPunk randomizer is not random, as it chooses bordering countries or appears to heavily favour some regions. Therefore, in this blog we will investigate just how likely these things are to happen!

Just a warning beforehand: this blog contains more mathematics than what you're used to from the blog section. I will attempt to explain everything clearly and light-heartedly, but if it's too much feel free to just scroll to the charts at the bottom.

How many options are there?

We know there are 196 countries in the world, and we want to select 20 of them at random. In how many ways can we do that?

To understand this, let's have a look at the urn on the right. There are 6 balls inside it, numbered 1 to 6. In how many different ways can we select 2 balls from this?
Let's pick the first ball. There are 6 options for this: we can pick any of the 6 balls. There are now 5 balls left in the urn.
Let's pick the second ball. Regardless of which ball we picked first, there are 5 options left in the urn. This means there are 6·5=30 different pairs of balls possible. With just 6 balls, the number of possible pairs is already large.
However, there's a catch. If we first picked ball 1, and then ball 2, that's one of the options. But if we first picked ball 2, and then ball 1, that's also among the 30 options, yet the result is the same: we have balls 1 and 2. Every pair is counted twice this way, so we still have to divide by 2 to get the number of unique sets. There are thus 15 ways to get 2 balls out of this urn.
The urn problem

Now, mathematicians wouldn't be mathematicians if they had not generalized this problem. The result is called the nCr function (pronounced "n choose r"). The urn problem above can be written as 6 choose 2, and this will neatly give us 15. The 6 is the number of balls in the urn, the 2 is how many we choose. If you want to understand why this does what it does, check out this Wikipedia page (Warning: there is a lot of detailed mathematics here!), but this is too detailed for this blog.
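The counting argument for the urn can be checked directly. The snippet below is a minimal Python sketch (it is not part of the original analysis) that lists every ordered draw and every unordered pair of the 6 numbered balls, and compares the count with the standard library's nCr function.

```python
from itertools import combinations, permutations
from math import comb

balls = range(1, 7)  # the 6 numbered balls in the urn

# Ordered draws: (1, 2) and (2, 1) are counted separately.
ordered_draws = list(permutations(balls, 2))

# Unordered pairs: (1, 2) and (2, 1) are the same set.
unordered_pairs = list(combinations(balls, 2))

print(len(ordered_draws))    # 30
print(len(unordered_pairs))  # 15
print(comb(6, 2))            # 15, the nCr function
```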

So, back to JetPunk. We have 196 countries and want to choose 20 of them. This means we want to calculate 196 choose 20, which equals 1,055,107,996,806,619,641,006,784,512. That's about 1 octillion options, or 10²⁷ in scientific notation.
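For readers who want to reproduce the order of magnitude, here is a small sketch in Python. The variable names are just illustrative; Python integers have arbitrary precision, so the value is computed without floating-point rounding.

```python
from math import comb, log10

n_countries = 196  # countries on the JetPunk world map, as stated in the text
n_selected = 20    # countries highlighted per quiz

n_options = comb(n_countries, n_selected)  # exact integer
print(n_options)
print(f"about 10^{log10(n_options):.1f}")
```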

How big is that number?

One octillion is so large we cannot imagine it any more. So here are a few comparisons to help you understand just how large this number is:
  • 10²⁷ kilograms is about 60% of the weight of Jupiter.
  • If you shower for 3.5 minutes, approximately 10²⁷ molecules of water have come out of the shower head (about 31 liters).
  • The longest novel ever written contained 9.6 million characters. If each of the 6.4 million English Wikipedia pages was as long as that novel, the totality of this new ultra-long Wikipedia would contain 6.2·10¹³ characters. That is still not even close to 10²⁷. However, if you replaced every character on the ultra-long Wikipedia with the entirety of the ultra-long Wikipedia, you would end up with about 3.8·10²⁷ characters, or almost 4 octillion characters.
  • How far would 10²⁷ metres be?
    • The circumference of the Earth is only 4·10⁷ metres. We need to go bigger.
    • The distance from the Earth to the Sun is 1.5·10¹¹ metres. That's not nearly enough.
    • The distance from the Sun to Pluto is about 5.9·10¹² metres. One octillion metres dwarfs our solar system.
    • The distance to the closest star (Proxima Centauri) is about 4·10¹⁶ metres. It takes light about 4.2 years to travel this distance, but we're not even close yet.
    • The closest other galaxy then? The Canis Major Dwarf Galaxy is about 2.4·10²⁰ metres away. We need to go more than 4,000,000 times that distance and then we're there!
    • The diameter of the observable universe is about as far as we can go. This is 8.8·10²⁶ metres. So 1 octillion metres is about 1.2 times the diameter of the observable universe. It would take light more than 100 billion years to travel this distance. That's gonna be a long hike.

How can we calculate the chance of getting a bordering country?

There are two options for working out the chance of having two bordering countries amongst the 20 selected countries. The first one is the easiest: brute-forcing. The idea behind brute-forcing is to go through every single option there is, and simply count how many times we have two bordering countries. Sounds like a plan! But wait... we had one octillion options. How long will that take?

To test this, I wrote a script that would just try to brute-force its way through all octillion options, and just see how long it would take. I could test 30,000 options every second this way. However, that's not nearly enough to make a dent. In 4.5 billion years from now, when the Sun will explode at the end of its lifetime, this code would have tested about 0.00001% of all options. It would take about 11,000,000 times the remaining lifetime of the Sun to finish. So grab a snack, we'll be here for a while!

The Sun, our friendly neighbourhood star

Since brute-forcing is not an option, we have to resort to the second option: a Monte Carlo simulation. Named after the famous casino in Monaco, the idea is to generate a lot of random options which are then representative of the full set. So from our 1 octillion options we randomly pick 1 million (which we can brute-force through), and use those 1 million to estimate the chance that a set contains two bordering countries.

Since this is of course a random process, the percentages will change if you pick a different set of 1 million options. Therefore we will repeat this process 12 times, thus essentially testing 12 million options. If all goes well the percentages shouldn't change too much between the different sets of 1 million options.
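The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the script used for this blog: the function names are made up, and the map is a hypothetical toy map with 6 countries instead of the real 196 countries and their borders. It draws random sets, checks each for a border, and repeats the experiment in 12 batches to show how much the estimate moves.

```python
import random

def has_border(selected, borders):
    """Return True if any two countries in `selected` share a border."""
    chosen = set(selected)
    return any(a in chosen and b in chosen for a, b in borders)

def estimate_border_chance(countries, borders, n_pick, n_trials, seed=0):
    """Monte Carlo estimate of the chance that a random set has a border."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        selected = rng.sample(countries, n_pick)  # draw without replacement
        if has_border(selected, borders):
            hits += 1
    return hits / n_trials

# Hypothetical toy map: A-B and B-C share borders; D, E and F are islands.
toy_countries = ["A", "B", "C", "D", "E", "F"]
toy_borders = [("A", "B"), ("B", "C")]

# Repeat the experiment in 12 batches, each with its own random seed.
estimates = [
    estimate_border_chance(toy_countries, toy_borders, n_pick=2,
                           n_trials=20_000, seed=batch)
    for batch in range(12)
]
print(min(estimates), max(estimates))  # the exact answer for this toy map is 2/15
```

On the toy map the exact answer is easy to count by hand (2 of the 15 possible pairs share a border), which makes it a handy check that the estimates land close to the truth.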

The chance of getting a bordering country

The results of the Monte Carlo simulation are shown below. Here we can immediately see that the chance of no bordering countries in a set of 20 is only about 5%: 19 out of 20 times you play Random Countries on the World Map, you can expect to end up with at least 1 border between the 20 countries. 3 out of 4 times you can expect to end up with between 1 and 4 borders. Ending up with a set with 2 borders is the most common outcome: this happens 21% of the time.

Ending up with bordering countries is thus not due to the JetPunk randomizer not being random: on average only 1 in 20 times you play you will not have a single border between the selected countries.

Let us take this one step further. Let's say we pick one country that will be in the set of 20 countries, and then randomly pick the other 19. What is the chance of having a border in the set then?

This of course depends on the number of borders that the country we picked has. If we pick Australia, there's no way we are gonna get a border with our pick, so we have to get a border among the 19 remaining ones that were randomly picked. On the other hand, if we pick Russia, any of the 14 countries bordering Russia ending up in the set is enough already. This is all shown in the graph below. Each country is represented by 12 dots, one for each set of 1 million options.

Indeed we can see that this depends heavily on the number of borders the selected country has. If we select Australia there's about a 93.5% chance of getting a border in that option, whereas if we select Russia that chance is 98.4%. The spread here is for the most part caused by the randomness of the options we select: which country with 0 borders we pick should not make a difference.
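The effect of fixing one country can be illustrated with a small Python sketch. Again this uses a hypothetical toy map and made-up names rather than the real border data: "Hub" plays the role of a country with many neighbours, "Island" the role of a country with none.

```python
import random

def estimate_chance_given(fixed, countries, borders, n_pick, n_trials, seed=0):
    """Estimate the chance of at least one border when `fixed` is always selected."""
    rng = random.Random(seed)
    others = [c for c in countries if c != fixed]
    hits = 0
    for _ in range(n_trials):
        chosen = set(rng.sample(others, n_pick - 1))  # the randomly picked rest
        chosen.add(fixed)
        if any(a in chosen and b in chosen for a, b in borders):
            hits += 1
    return hits / n_trials

# Hypothetical toy map: "Hub" has three neighbours, "Island" has none.
toy_countries = ["Hub", "N1", "N2", "N3", "Island", "X", "Y", "Z"]
toy_borders = [("Hub", "N1"), ("Hub", "N2"), ("Hub", "N3"), ("X", "Y")]

for fixed in ("Hub", "Island"):
    chance = estimate_chance_given(fixed, toy_countries, toy_borders,
                                   n_pick=3, n_trials=20_000)
    print(fixed, round(chance, 3))
```

For this toy map the exact chances are 16/21 with "Hub" fixed and 4/21 with "Island" fixed, so the well-connected country raises the chance of a border considerably.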

The one exception is Saudi Arabia, marked in red. Saudi Arabia borders Qatar, which has only 1 border, and the United Arab Emirates, Kuwait and Yemen, which have 2 borders each. This means that for these 4 countries the chance of creating a border is drastically increased when Saudi Arabia is in the set, since they normally depend on just one or two other countries also being selected. For the other countries with 7 borders this is not the case, since they only border countries that have a lot of borders themselves.

Using the spread of the percentages of the countries with 0 borders (which should all be the same since, well, they have no borders) we can get an error estimate on our conclusion that 95% of the options have at least one border. How exactly this works is too detailed for this blog, but the final estimate is that 94.99±0.24% of the options have at least one border with 99% confidence.

99% confidence means that the chance is about 1% that the actual percentage of all 1 octillion options that have at least one border is not between 94.75% and 95.23%. I'd say that's worth it in exchange for not having to wait for the Sun to explode 11,000,000 times.
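To give a feeling for how such an error estimate can be built, here is a sketch of one standard technique: a Student's t confidence interval on the mean of repeated batch estimates. This is not necessarily the method used for the numbers above, and the batch values in the snippet are invented for illustration, not the results of the simulation.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical batch results (percent of sets with at least one border).
# These are illustrative values, NOT the data behind this blog.
batch_percentages = [94.9, 95.1, 95.0, 94.8, 95.2, 95.0,
                     95.1, 94.9, 95.0, 95.0, 94.9, 95.1]

# Two-sided 99% critical value of Student's t with 11 degrees of freedom
# (12 batches minus 1), taken from a standard t-table.
T_99_DF11 = 3.106

centre = mean(batch_percentages)
half_width = T_99_DF11 * stdev(batch_percentages) / sqrt(len(batch_percentages))
print(f"{centre:.2f} ± {half_width:.2f} %")
```

The more the batches disagree with each other, the larger the sample standard deviation and the wider the resulting interval.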

What is the maximum amount of borders you can get?

We have now looked at what the chances are of getting no borders at all between your set of countries. But what if we reverse that problem: by picking countries that all lie next to each other, what is the maximum number of borders you can get?
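Counting the borders inside a given set is simple to express in code. The sketch below uses a hypothetical toy map of 5 countries; on a map this small every possible set can be tried, which is exactly what is not feasible for 20 out of 196 countries.

```python
from itertools import combinations

def count_borders(selected, borders):
    """Count the border pairs that lie entirely inside `selected`."""
    chosen = set(selected)
    return sum(1 for a, b in borders if a in chosen and b in chosen)

# Hypothetical toy map: a square A-B-C-D with one diagonal, plus a tail D-E.
toy_countries = ["A", "B", "C", "D", "E"]
toy_borders = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"),
               ("A", "C"), ("D", "E")]

# Small enough to brute-force: try every set of 3 and keep the best one.
best = max(combinations(toy_countries, 3),
           key=lambda group: count_borders(group, toy_borders))
print(best, count_borders(best, toy_borders))
```

For the real map the search has to be guided by hand or by a smarter algorithm, starting from regions where many countries with many neighbours sit close together.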

There are three obvious areas to try this: Asia, Central Africa and Southeastern Europe.

Asia

The first area we can try is Asia. It turns out that with the 19 green countries on the right you can create a set with 44 borders. By adding any of the 6 yellow countries, two more borders are added for a grand total of 46.

The 46 border options in Asia (and partially Europe). The 19 green countries are required, and any of the 6 yellow countries should be added to make the full set of 20.

Africa

In central and southern Africa we can do something similar, and it turns out there are two distinct options that again result in 46 borders. The left option requires three of the four yellow countries to be added, the right option requires one of the five. Note that selecting Malawi, Burundi, and Ethiopia in the left image, and Cameroon in the right image, will lead to the same solution. There are therefore 8 distinct options here.

Option 1 for 46 borders in Africa. The 17 green countries are required, and 3 of the 4 yellow countries should be added.
Option 2 for 46 borders in Africa. The 19 green countries are required, and any of the yellow countries should be added.

Europe

Around the Balkans we can again select 20 countries, yet whatever we tried, we could not get past 45 borders here. For 45 borders there are too many options to show here though.

The maximum number of borders thus is 46, which can be achieved in 14 different ways: 8 in Africa, and 6 in Asia (and a bit of Europe).

The chance of X amount of countries from a continent

The final often-heard complaint is that the randomizer seems to favour certain regions like Oceania or the Caribbean. We can of course also calculate the chance of getting for example 10 countries in Oceania.

So let us go back to the urn. This time there are no numbered balls, but coloured balls! You can imagine Red = Africa, Blue = Asia and Green = Europe to see how this will work.
Now what is the chance that we have exactly 1 blue ball if we draw 2 balls?
There are two ways this can be achieved: first draw a blue ball, then a non-blue ball, or first draw a non-blue ball, and then a blue ball.
The first option: the chance that we first draw a blue ball is 2/6. The chance that we then draw a non-blue ball with one blue ball removed from the urn is 4/5. Multiplying these two gives an 8/30 chance that this happens.
The second option: the chance that we first draw a non-blue ball is 4/6. The chance that we then draw a blue ball with one non-blue ball removed from the urn is 2/5. Multiplying these two again gives an 8/30 chance that this happens.
The total chance that this happens is then these two added together: 16/30.
The urn problem again, but this time with coloured balls!

Some combinations will obviously be impossible: it is impossible to get 2 green balls in 2 draws, since there is only one green ball present. The chance of this is then 0.
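The two urn results above can be reproduced by counting sets instead of following the draws one by one. The snippet is a small sketch with an illustrative function name: it counts the sets that contain the wanted number of balls of one colour and divides by the total number of sets.

```python
from fractions import Fraction
from math import comb

def chance_of_colour(n_balls, n_colour, n_draws, n_wanted):
    """Chance of exactly `n_wanted` balls of one colour in `n_draws` draws."""
    ways = comb(n_colour, n_wanted) * comb(n_balls - n_colour, n_draws - n_wanted)
    return Fraction(ways, comb(n_balls, n_draws))

# Urn from the text: 6 balls, 2 of them blue and 1 of them green.
print(chance_of_colour(6, 2, 2, 1))  # 8/15, the same as 16/30
print(chance_of_colour(6, 1, 2, 2))  # 0, two greens are impossible
```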

Mathematics has also generalized this problem into a fixed equation, but that equation is too complicated for this blog. More information can be found on this Wikipedia page (warning: difficult mathematics ahead!). Here I will just show the results. Note that South America and the Caribbean have the same number of countries, thus the chances for each number of countries are the exact same.

We can see that the chance of getting 10 countries from Oceania is basically 0. In fact, this would only happen once in every 150 million takes. With almost 500,000 takes so far, it would take about 149.5 million more before we would expect that to have happened once. 8 countries from Oceania, on the other hand, happens once every 200,000 takes. This should have happened 2 to 3 times by now.

The chance of getting 10 African countries, on the other hand, is about 1.5%. So once every 70 takes someone will end up with a very Africa-centric quiz.
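The same set-counting idea scales up to the world map. The sketch below assumes that 54 of the 196 countries are in Africa; that count is an assumption of this example and is not stated in the text above. The function name is illustrative as well.

```python
from math import comb

def chance_from_region(n_region, n_wanted, n_total=196, n_pick=20):
    """Chance of exactly `n_wanted` of the picked countries lying in one region."""
    ways = comb(n_region, n_wanted) * comb(n_total - n_region, n_pick - n_wanted)
    return ways / comb(n_total, n_pick)

N_AFRICA = 54  # assumed number of African countries; not stated in the text

# Chance of 0, 1, 2, ..., 20 African countries in a set of 20.
distribution = [chance_from_region(N_AFRICA, k) for k in range(21)]
for k, chance in enumerate(distribution):
    print(f"{k:2d} African countries: {chance:.4%}")
```

Swapping `N_AFRICA` for the number of countries in another region gives the corresponding chart for that region.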

Conclusion

Given all this, can we say that the JetPunk randomizer is truly random? I would say so, since all arguments given on why it should not be random can be debunked using mathematics. To list them all again:
  • There are over 1 octillion options to choose 20 countries out of the 196 we have!
  • The randomizer picks countries bordering each other. It indeed does, and statistically it should do so 94.99±0.24% of the time. Thus only 1 in 20 times you play the quiz you should end up with a borderless set.
  • The randomizer puts a lot in Africa. Africa has the most countries, so indeed most should be in Africa on average. 1 in 70 quiz takers will end up with 10 African countries!
  • Given the 500,000 takes, even 8 out of 20 in Oceania should have happened 2 or 3 times by now.

Finally, if you get really lucky (or unlucky, depending on how you see it), you can get a set with 46 borders between the 20 countries. There are even 14 ways to do so! Good luck finding one of them though, remember that there are 1 octillion options :).


Many thanks to Stewart for the discussions we had on this and helping with finding the maximum amount of borders possible!

29 Comments
+6
Level 36
Jan 20, 2022
So much effort and facts put into this!

Very interesting blog- very entertaining

Can't wait for more!

This deserves top spot on the blog games!

Amazing job!

+1
Level 65
Jan 20, 2022
Thanks a lot! I really enjoyed researching and writing this!
+5
Level 43
Jan 20, 2022
If Ethaboo or Chen think on putting any 9 on this blog at BG, I give up of anything. This is just a wonder, with all the charts, and explanations. Couldn’t understand a half of them, as my mathematics are really superficial lol. I want more of these.

Should I really ask Stewart about an option of nominate blogs? At this point, is more than needed.

+1
Level 65
Jan 20, 2022
If you want extra explanation on anything let me know! I'm happy to add additional explanations! And thanks a lot for your kind words!
+1
Level 43
Jan 20, 2022
Well, it was understandable at the most points, but as my math sucks, I had some difficulties at some points, but the examples you added were really helpful! :)
+1
Level 65
Jan 21, 2022
That's understandable, glad you enjoyed!
+1
Level 66
Jan 20, 2022
Yes, blog nominations please! We could have 3 levels of blogs, the JetPunk blog with the very best, featured blogs which are not shown on the front page but still have a special page, and just normal blogs.
+1
Level 43
Jan 20, 2022
There is already “JetPunk Blog”, with the featured blogs, including AFC one, of course! Maybe if it ever happen (add an “n” before the “ever”), the possible place will be there.
+1
Level 66
Jan 20, 2022
Yeah, that's why I said.
+1
Level 43
Jan 20, 2022
Oh, it’s because I read about the special page, but I didn’t read well the rest lol
+3
Level 68
Jan 21, 2022
When I made the blog update originally, I had an idea to do some kind of "trending" blogs, but I could never figure a way to make it fair and not manipulatable.

That's the big problem with nominations for quizzes, they are often biased towards particular types of quizzes.

+2
Level 62
Jan 20, 2022
This is absolutely great! Very interesting as well lol with numbers but a bit too advanced for my brain to handle haha.
+1
Level 65
Jan 20, 2022
Thanks!
+2
Level 69
Jan 20, 2022
Phenomenal. My brain can't wrap around a lot of the stuff here lol. If I ever happen to need to calculate the randomness of something, I now know who to ask. This blog is the textbook example of what a blog should be.
+2
Level 65
Jan 20, 2022
Thanks a lot!
+2
Level 53
Jan 20, 2022
I wish I was able to work the charts in the way. Excellent job. This deserves high praise.
+2
Level 70
Jan 20, 2022
I agree with the previous comments, this is indeed excellent. Definitely one of the best blogs to be on the RUB in the last few months.
+1
Level 65
Jan 21, 2022
Thanks a lot!
+3
Level 68
Jan 20, 2022
Ouch, my head!
+3
Level 68
Jan 20, 2022
Excellent blog. The Saudi Arabia exception is interesting.
+2
Level 65
Jan 21, 2022
When I first saw that I was really surprised and thought I'd made a mistake somewhere :). But indeed it's super interesting that it jumps out so much
+4
Level 62
Jan 20, 2022
The beauty of Number Theory :)

Great Blog!

+1
Level 65
Jan 21, 2022
Thank you!
+2
Level 55
Jan 21, 2022
One of the most interesting blogs I've read so far!

It's always interesting to know more about JetPunk, and you made efforts to make something "complex" partially easy to understand for most of your readers. Is there a better way to start blogging? Congrats for this excellent blog!

+2
Level 65
Jan 21, 2022
Thanks a lot Baptiste!
+3
Level 75
Jan 26, 2022
Best blog of the year so far???!!
+4
Level 65
Jan 26, 2022
Even though the year is only 26 days old, thanks!
+1
Level 68
Jun 28, 2024
Happy Australia Day!
+1
Level 68
Sep 20, 2024
Hey, thank you for the vlog, it was an interesting read. I understand that this is supposed to be a blog and not a thesis, but there are multiple questions I have to your calculations. First of all, did you program your own random generator to run the Monte Carlo simulation? Otherwise if you've used the Jetpunk generator, how would you estimate the true chances when we don't know yet, whether the JP generator is truly random? Second, why is the spread of the island picks necessary to estimate the variance? Could you not have used the mean chance for getting a borderless game and the sample size to calculate variance? Lastly, would it not have been a more definite approach to just compare mean chances between your own generator and JP generator and run a t-test with 5% significance level? Then you'd have been able to either reject or not reject the Null hypothesis that the mean chances are equal and you would not need to guess the randomness.