Build a prosperous settlement in a unique simulation of a feudal economy - each family runs their shop, owns their goods, and trades with other villagers in a free market. You will manage the land, build houses, organise events, set taxes, and balance supply and demand while engaging in diplomacy!
The public beta for the campaign update is soon coming to an end. In fact, we expect it to go live as early as next week! Besides re-designing the tutorial experience into fifteen distinct missions, each focusing on a unique game mechanic, we are adding one more major improvement to the game - overhauled pathfinding system.
It is improving performance dramatically and because we went through quite the journey of solving it over the years, maybe it would be a fun read to share it with you. So let's get into it!
[h2]What is Pathfinding?[/h2]
If you are not sure what pathfinding is, let me first introduce it a little bit. In the world of AI, we typically refer to NPCs or other intelligent creatures - in our case the villagers - as agents. Agents can do a lot of things, but almost all of them involve getting from one place to another in the game world. So pathfinding is just that - mathematically solving the problem of finding a viable path between two points on the map.
In most cases, this is about as complicated as solving a maze with a pen. You start at a starting point and gradually explore all directions until you reach your goal. If you are smart, you prioritize your choices when to take turns by taking guesses. For example, if the goal is on the right end of the maze, it does not make so much sense to explore paths on the left before you explore the path to the right. It could be wrong, but on average, making these kinds of smart guesses will make you faster. So the art of making the right kind of guess is also a big part of pathfinding.
[h2]Common Approach[/h2]
At first, I utilized a very common algorithm known as A*. It is quite simple and sufficient for most games. The main premise is to make a good guess about the next direction, while exploring multiple paths at the same time. As it alternates between each direction, it keeps track of how long the path is so far and a guess about how many steps are left before it can reach the goal from where it currently is. Then in every step, it simply explores the direction that is most likely to be the shortest until either all options are exhausted, or the goal has been reached.
This algorithm is often quite fast, but when paired up against hundreds of villagers each searching multiple paths every frame, sometimes across the whole map (which is not small, especially the experimental ones), it was soon obvious this would not hold on its own.
[h2]Memory To The Rescue[/h2]
The first thing I considered was to have villagers remember things more, so they do not have to calculate the paths over and over again. After all, if the maze did not change, why not just store already calculated paths in the memory and simply ask for it again later? Furthermore, each discovered path works in both directions, so if they are stored, only half of the maze needs to be solved.
This was a nice idea in theory, but the reality was quite different. On larger maps, memory was bloating really fast, reaching several gigabytes of extra storage. And unfortunately, it was not even making anything faster for two main reasons.
First is that our villagers recognize differences in where they are allowed to go. Not every villager sees the same "maze". Some can pass through some doors, some can lockpick them, while others must avoid them completely. So they are relatively unlikely to ever share already calculated paths with another villager. Most paths were never reused, until nearly all path-finding maps were fully solved and already taking gigabytes of data.
The second problem was the fact that our world is very dynamic and constantly changes. Whenever someone builds a wall, any paths that are going through this location must be thrown away. If we destroy a wall, there could be an already stored path that could now reach its goal faster. We can keep it anyway and sacrifice the accuracy for performance, but this can get quite extreme and it is not very practical. And since there is no good way to tell which paths would be affected, in the end, all of them had to be discarded with a single change on the world map.
The third and small issue was that all of the memory would be lost whenever you would reload a save file, so attempts were made to also serialize this data. This however led to extreme loading times and very large save files, so I abandoned it quickly as well.
[h2]Multicore Approach[/h2]
Most CPUs nowadays have multiple cores that can be leveraged to improve performance, but this does not happen automatically so as my next attempt to make things better I went in to write an architecture that can leverage them.
Every agent now registers the paths they need and puts them in a queue. Then every frame, the pathfinding service would create a batch of multiple requests and schedule them for completion. This allowed me to compute multiple paths at the same time each on a different core during a single frame. The downside was that there was now a guaranteed one-frame delay for each request, but it was a small price for the gains in performance.
Having a queue of requests also allowed me to prevent spikes by limiting how many paths get computed every frame. This eliminated nearly all performance spikes unless a rare very long path would have to be calculated. On the other hand, if a lot of agents requested paths at the same time, they would now have to wait several frames before their request was completed and it introduced a new problem - villagers periodically stopping and standing while waiting for the pathfinders' response.
This was especially apparent every time a wall was built or destroyed. All villagers had to stop their movement, request a new path, and then wait for several frames to get them. This made any large maps with hundreds of villagers frustrating to play (though they were not exactly playable before either).
Despite this, performance gains were still very significant. I also converted the code to use native data structures in order to utilize Burst compiler. To cover how it works would be rather complex so let's just say it made each iteration of the pathfinding run a lot faster on the CPU. And this is where we were until now.
[h2]Hierarchical Pathfinding[/h2]
Pathfinding is now becoming pretty fast but it still has two major downsides - very long paths still take forever to compute, and any time something changes anywhere on the map, all agents must request a new path. It was clear that hierarchical pathfinding would be necessary to solve this.
This one is more difficult to explain, but in essence, it works by separating the map into individual clusters - rectangles of a uniform size. What we want is that any change inside of this rectangle will only affect the rectangle they are part of, and its direct neighbors. If we can somehow do that, not only we can now store some paths in the memory without having to constantly throw them away, but perhaps we can also leverage this to make very long paths to be calculated much faster.
To do this, we will search on the border of each cluster, and create a connection with its neighbor when we can see that an agent could cross the border there. We can also merge some of them to reduce their amount (with some limitations that I will omit here for the sake of simplicity). Then we will calculate a path between all connections inside of the cluster to get something like this:
[img]{STEAM_CLAN_IMAGE}/37633948/8e0de334b7940f93f203ba9834dd7d71ebcbce6b.png[/img]
Now each time a world changes, we only need to repeat this process for the cluster that contains this change. It also turns out that solving just the connections between these entrance points and connecting them in a chain will be very close to the most optimal path for any combination of points on the map. So our pathfinding almost magically becomes a lot more simple.
Because we know which points can reach other points and how long is the path that connects them, we just need to connect the dots on the way to our destination and search for some final bits around the starting point and the destination. If this is too complex to understand, think of it as an abstract orientation map. Instead of having to look directly under your feet with every step, there are direction signs all around you and you just need to blindly follow them.
On a large map of 192x192, traditional A* had to solve sometimes up to 40 000 nodes to conclude its search. On a map of this size, we would create on average between 1000 to 1500 crossings. This makes the maze that we need to solve about 5-10% of its original size. So this abstract map pre-computes about 90% of the complexity of the pathfinding that is shared between all possible paths, leaving the final 10% to figure out some details specific to each combination of the starting and destination points.
This makes even very long paths solved blazingly fast for only a small amount of added memory and a tiny loss in accuracy. And because searching individual paths is so much faster, it does not really make sense to store any of them anymore, since the cost of managing this storage would likely outweigh the benefits of it. What makes sense to do here, is for each agent that follows a path, to also remember which clusters it will go through, and have them request a new path only if these clusters get modified before they reach the final destination. Occasionally they might take a longer path when a shorter path just opened up for them, but these scenarios should be very rare and we can avoid the awkward occasional village-wide halting of movement.
Every path can also be re-traced and smoothed out for only a small extra performance cost to make the final path nearly exactly the same as any high-cost low-level A* algorithm would find. In the end, I decided to make this step optional so players can toggle this in settings if their CPU can handle the workload.
[h2]What is the Catch?[/h2]
Of course, all optimization has its tradeoffs. In this case, building a wall needs to rebuild the cluster immediately, which in rare cases can lead to a performance spike, if the cluster is particularly complex. We are also taking up a bit more memory than before to do this. Searching the path in this abstract map is also very difficult to do natively so the benefits of Burst have been somewhat reduced. Loading times were also extended by up to 10 seconds on very large maps as this abstract map needs to be solved before the game begins. Multiple cores are still leveraged and they make updating a cluster a lot faster, but they do not have a lot of impact on individual searches.
I hope you enjoyed this rather technical read! I would like to open up about the complexity of our game, to shed more light about the challenges we face when optimizing it. What we do is not typical for indie games and it is not always easy to see as much of that work happens under the hood.
If you did enjoy this article, let me know in the comments and tell me if you would like to see more of them! There is a lot of interesting stuff to cover, so I would be happy to share them with you. In the meantime, come hang out with the community on our [url=https://discord.lordsandvilleins.com/]Discord[/url], and celebrate the Strategy Fest with us!
Michal
Honestly Games