The data product in question is a Directed Acyclic Graph depicting the connections and interrelations between Sid Meier’s Civilization 6’s Civics, Technologies, and corresponding boosts. That is, it is a visual representation of how different aspects of the Civic and Technology trees relate to each other.
Data Products are the outputs of Data Science or Data Engineering efforts and typically have the following workflow:
All Data Products serve to answer a question, preferably an interesting one. In Civ 6, the Civics and Technology trees are separate; however, they are interlinked in some subtle and non-subtle ways. The purpose of this product is to answer the question:
How are the Civics, Technologies, and general gameplay intertwined?
The Civic and Technology tree data is easily accessible in the GamePlay SQLite database, and a set of SQL queries went a long way. Gathering the boosts data was a little trickier. The raw data is in the gameplay DB; however, it isn’t in a form that is easily usable in the graph. In the end, I extracted the raw data into Microsoft Excel and exported the boost-related nodes (vertices) and edges as CSV files for addition to the output graph.
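As a rough illustration of the kind of SQL query involved, here is a minimal sketch that pulls prerequisite edges out of a SQLite database. It runs against an in-memory stand-in, and the table names, column names, and sample rows are assumptions made for the example rather than a verified copy of the game's schema.

```python
import sqlite3

# In-memory stand-in for the gameplay database. The table and column names
# are illustrative assumptions, not a verified copy of the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TechnologyPrereqs (Technology TEXT, PrereqTech TEXT);
    CREATE TABLE CivicPrereqs (Civic TEXT, PrereqCivic TEXT);
    INSERT INTO TechnologyPrereqs VALUES
        ('TECH_ARCHERY', 'TECH_ANIMAL_HUSBANDRY'),
        ('TECH_BRONZE_WORKING', 'TECH_MINING');
    INSERT INTO CivicPrereqs VALUES
        ('CIVIC_FOREIGN_TRADE', 'CIVIC_CODE_OF_LAWS');
""")

# Each row becomes one directed edge: prerequisite -> unlocked item.
edge_query = """
    SELECT PrereqTech AS source, Technology AS target, 'tech' AS kind
    FROM TechnologyPrereqs
    UNION ALL
    SELECT PrereqCivic, Civic, 'civic'
    FROM CivicPrereqs
    ORDER BY source
"""
edges = conn.execute(edge_query).fetchall()
conn.close()

for source, target, kind in edges:
    print(f"{source} -> {target} ({kind})")
```

The resulting list of (source, target, kind) tuples is already in the shape a graph library expects for an edge list, which is why the tree data needed little further work.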
The graph itself was created using the Python NetworkX module, with the nodes and edges colour-coded and styled to help you interpret it. Some nodes and edges were changed (or even deleted) to reduce unnecessary complexity.
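One way to carry the colour coding through NetworkX is to store a category on each node and edge and map it to a colour or line style afterwards. The sketch below assumes an attribute called `kind`, a small hand-picked palette, and a few sample nodes; none of these are taken from the original scripts.

```python
import networkx as nx

G = nx.DiGraph()

# Sample nodes only; "kind" is an assumed attribute name.
G.add_node("Mining", kind="technology")
G.add_node("Bronze Working", kind="technology")
G.add_node("Craftsmanship", kind="civic")
G.add_node("Improve 3 tiles", kind="boost")

G.add_edge("Mining", "Bronze Working", kind="prerequisite")
G.add_edge("Improve 3 tiles", "Craftsmanship", kind="boost")

# Hypothetical palette: one colour per node category.
palette = {"technology": "tab:blue", "civic": "tab:purple", "boost": "tab:orange"}
node_colours = [palette[G.nodes[n]["kind"]] for n in G.nodes]

# Boost edges are drawn dashed so they stand out from prerequisites.
edge_styles = [
    "dashed" if data["kind"] == "boost" else "solid"
    for _, _, data in G.edges(data=True)
]

# A tech/civic tree should never loop back on itself.
print(nx.is_directed_acyclic_graph(G))
```

The `node_colours` list is in the same order as `G.nodes`, so it can be handed to the `node_color` parameter of `nx.draw_networkx`; the edge styles can drive whichever drawing backend is used.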
I also used this opportunity to learn Python and was pleasantly surprised by its ease of use and by the whole data science ecosystem available.
Courtesy of NetworkX, the graph (network) properties are:
- Name: Civ 6 R&F Civic, Technology, and Boosts network
- Type: DiGraph
- Number of nodes: 357
- Number of edges: 526
- Average in degree: 1.4734
- Average out degree: 1.4734
- Network density: 0.004
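The average degree and density figures follow directly from the node and edge counts. For a directed graph with n nodes and m edges, the average in degree (and out degree) is m / n, and the density is m / (n(n - 1)). A quick check of the arithmetic, using only the counts listed above:

```python
nodes = 357
edges = 526

# Every edge contributes one in-connection and one out-connection,
# so the average in degree and out degree are the same number.
avg_degree = edges / nodes

# A directed graph can have at most n * (n - 1) edges (no self-loops).
density = edges / (nodes * (nodes - 1))

print(f"Average in/out degree: {avg_degree:.4f}")
print(f"Network density: {density:.3f}")
```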
In summary, this tells us that we have a sparse network: only about 0.4% of the possible directed edges are present. That is, this is about as far as we can get from the typical social network examples popular in data science tutorials on the internet.
Degree is a measure of how many incoming and outgoing connections each node has. The assumption here is that the more essential nodes have more relationships. Nodes with a high degree are usually known as Hubs. In this graph, the top 10 nodes by degree are:
- Settle First City (10)
- Natural History (10)
- Mining (10)
- Radio (9)
- Guilds (8)
- Humanism (8)
- Ideology (8)
- Education (8)
- Industrialization (8)
- Flight (8)
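A ranking like this can be produced by sorting the output of `G.degree()`, which for a `DiGraph` counts incoming and outgoing edges together. This is a minimal sketch on a toy graph with a handful of sample edges, not the full 357-node network:

```python
import networkx as nx

# Toy stand-in for the full graph; edges point from prerequisite to unlock.
G = nx.DiGraph()
G.add_edges_from([
    ("Pottery", "Writing"),
    ("Mining", "Masonry"),
    ("Mining", "Bronze Working"),
    ("Mining", "The Wheel"),
    ("Bronze Working", "Iron Working"),
])

# G.degree() yields (node, in_degree + out_degree) pairs.
top_hubs = sorted(G.degree(), key=lambda pair: pair[1], reverse=True)[:2]

for name, degree in top_hubs:
    print(f"- {name} ({degree})")
```

Swapping `G.degree()` for `G.in_degree()` or `G.out_degree()` would rank nodes by how much they depend on, or how much they unlock, respectively.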
Betweenness centrality is good at finding nodes that connect two otherwise disparate parts of a network, that is, at highlighting the Brokers. In this graph, the top 10 nodes by betweenness centrality are:
- Flight (0.026)
- Industrialization (0.025)
- Scientific Theory (0.022)
- Radio (0.022)
- Humanism (0.020)
- Guilds (0.019)
- Defensive Tactics (0.019)
- Banking (0.015)
- Gunpowder (0.015)
- Feudalism (0.014)
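NetworkX exposes this measure as `nx.betweenness_centrality`, which is normalised by default. The sketch below uses a deliberately tiny graph with made-up node labels, in which a single node sits between two groups, to show how a Broker stands out:

```python
import networkx as nx

# Hypothetical bow-tie graph: X is the only link between {A, B} and {C, D}.
G = nx.DiGraph()
G.add_edges_from([
    ("A", "X"),
    ("B", "X"),
    ("X", "C"),
    ("X", "D"),
])

# Fraction of shortest paths between other node pairs that pass through each node.
centrality = nx.betweenness_centrality(G)

ranked = sorted(centrality.items(), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"- {name} ({score:.3f})")
```

Here X lies on every shortest path from A or B to C or D, so it is the only node with a non-zero score. In a large, sparse graph the scores are much smaller, which is consistent with the values in the list above.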
A basic graph showing the potential routes to Archery:
A more complex map showing the possible ways to Stirrups:
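A "routes to" view of this kind can be derived by keeping only the target node and everything upstream of it. The sketch below does that with `nx.ancestors`; the helper name `routes_to` and the sample edges are assumptions for illustration and may differ from how the original images were generated.

```python
import networkx as nx

# Sample edges only; the last one is unrelated to Archery on purpose.
G = nx.DiGraph()
G.add_edges_from([
    ("Animal Husbandry", "Archery"),
    ("Kill a unit with a Slinger", "Archery"),
    ("Mining", "Bronze Working"),
])


def routes_to(graph, target):
    """Return the subgraph of every node that can lead to `target`, plus `target`."""
    keep = nx.ancestors(graph, target) | {target}
    return graph.subgraph(keep).copy()


archery = routes_to(G, "Archery")
print(sorted(archery.nodes))
print(archery.number_of_edges())
```

Because the result is an ordinary `DiGraph`, it can be drawn or analysed in exactly the same way as the full network.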
I didn’t find any real surprises in the Hub or Broker nodes. I was, however, pleasantly surprised by the route(s) to Stirrups. I knew it was broad, but seeing it was still an exciting experience. Also, as it contains most key civics and many useful technologies, it would make sense to look at minimising the costs (and turns) associated with getting Stirrups.
As it stands, the graph is an excellent base for generating route images for other nodes, and I look forward to exploring these.
Turning to NetworkX and the wider data science toolset available in Python, I can honestly say that I was happy to learn how easy they are to use, and how much depth there is to the tools. Of course, I am merely scratching the surface, and there is plenty of complexity under the covers.
If you are reading this, you have been disseminated to. I doubt this effort will feature in any academic publications, but the internet is a big place, and you never know who may find this useful.
You can find the original scripts and data in my Civ 6 Scripts repository on GitHub.