I recently completed an experiment to see what applying Data Science, or more specifically Machine Learning using XGBoost and SHAP, could tell me about where to settle the first city in Sid Meier’s Civilization (Rise & Fall). That is, I built a human-interpretable model of this decision.
Civilization is a turn-based strategy game centred on building a civilization on a macro-scale, from prehistory up to the near future.
I’ve been a fan of the series since the last century. I love the complexity of it, the near-perfect balance between variability and predictability, and how you need to adapt your approach to each scenario. Sure, there are some rote elements. Still, overall, the complex interactions between the various game mechanics are difficult to quantify definitively.
Where to settle your first city is the first decision you need to take, and one of the most significant. The challenge is that you have limited information available, and balancing all the potential requirements for success is complicated, with no clear rules.
This is where the Machine Learning part comes into play. The idea of feeding this limited information into a human-interpretable model is appealing, so I decided to build one.
I will skip the low-level details of how exactly I built the model and jump straight to the results. If you want to understand how I made the model, you can visit the GitHub repo.
One benefit of using XGBoost to build the model is that I could reconstruct this result from the individual trees and eventually write an algorithm to repeat it. Still, I don’t have to, because SHAP does a great job of doing that for me.
The SHAP summary graph above, which shows each feature’s impact on the model, lists the key features contributing to a successful outcome in order of influence, and is easy for humans to interpret.
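To give a feel for how a summary like this ranks features, here is a minimal sketch of the idea behind SHAP: exact Shapley values computed by brute force. It is not the model from the repo, and it is not how the SHAP library works internally for tree models (that uses a much faster tree-specific algorithm). The feature names, the `toy_model` scoring function, and the sample rows are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical starting-tile features (illustrative names, not from the real model).
FEATURES = ["fresh_water", "hills", "food_yield"]

# Hypothetical sample of starting locations, also used as the background data.
ROWS = [
    {"fresh_water": 1, "hills": 1, "food_yield": 3},
    {"fresh_water": 1, "hills": 0, "food_yield": 2},
    {"fresh_water": 0, "hills": 1, "food_yield": 4},
    {"fresh_water": 0, "hills": 0, "food_yield": 1},
]


def toy_model(x):
    """Stand-in for a trained model: a tiny hand-written tree-like score."""
    score = 0.5 * x["food_yield"]
    if x["fresh_water"]:
        score += 2.0
        if x["hills"]:
            score += 1.0  # hills only help here when fresh water is present
    return score


def coalition_value(model, x, background, subset):
    """Mean model output with `subset` features taken from x, the rest from the background."""
    total = 0.0
    for b in background:
        mixed = {f: (x[f] if f in subset else b[f]) for f in FEATURES}
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley value of each feature for one row (exponential in feature count)."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        value = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(model, x, background, set(subset) | {f})
                without_f = coalition_value(model, x, background, set(subset))
                value += weight * (with_f - without_f)
        phi[f] = value
    return phi


def rank_features(model, rows):
    """Order features by mean absolute Shapley value, largest first."""
    totals = {f: 0.0 for f in FEATURES}
    for row in rows:
        for f, v in shapley_values(model, row, rows).items():
            totals[f] += abs(v)
    means = {f: t / len(rows) for f, t in totals.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, importance in rank_features(toy_model, ROWS):
        print(f"{name}: {importance:.3f}")
```

For each row, the Shapley values sum to the difference between the model’s prediction for that row and its average prediction over the background data. That additive property is what makes the per-feature contributions easy to read. Sorting by mean absolute value gives the kind of top-to-bottom ordering a summary graph shows.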
Doing this exercise a few years ago would have been very challenging, and I am continuously impressed by the rich ecosystem of tools available for these sorts of experiments.