A fascinating AI technique is used in one of the best roguelikes, Brogue. The author called it "Dijkstra Maps". Basically it's about generating a heatmap for all the squares on the level. You can start with 0 at player position, and from that point use a simple floodfill algorithm, putting down increasing numbers with each step. Then a monster simply examines all adjacent squares and selects the one with lowest number. This has at least two notable advantages:
1. You only need to do the path generation once, and it scales very well with the number of monsters.
2. It's really good at combining several concerns, because you can generate a couple of heat maps for different concerns and add them up. For example one heat map is about proximity to player. Another could be about proximity to health pickups, or proximity to cover, or proximity to open space if a monster likes to keep distance and shoot. If a gas trap is triggered, you can use a "danger" heat map. Then a monster can easily get closer to player and at the same time choose the path which has fewer harmful effects.
That's why centaur archers in Brogue are so annoying, monsters avoid traps intelligently, monster groups avoid wasting their numerical advantage by chasing player through a corridor.
James Gosling's Emacs screen redisplay algorithm also used similar "dynamic programming techniques" to compute the minimal cost path through a cost matrix of string edit operations (the costs depended i.e. on the number of characters to draw, length of the escape codes to insert/delete lines/characters, padding for slow terminals, etc).
>Gosling Emacs was especially noteworthy because of the effective redisplay code, which used a dynamic programming technique to solve the classical string-to-string correction problem. The algorithm was quite sophisticated; that section of the source was headed by a skull-and-crossbones in ASCII art, warning any would-be improver that even if they thought they understood how the display code worked, they probably did not.
/* 1 2 3 4 .... Each Mij represents the minumum cost of
+---+---+---+---+----- rearranging the first i lines to map onto
1 | | | | | the first j lines (the j direction
+---+---+---+---+----- represents the desired contents of a line,
2 | | \| ^ | | i the current contents). The algorithm
+---+---\-|-+---+----- used is a dynamic programming one, where
3 | | <-+Mij| | M[i,j] = min( M[i-1,j],
+---+---+---+---+----- M[i,j-1]+redraw cost for j,2
4 | | | | | M[i-1,j-1]+the cost of
+---+---+---+---+----- converting line i to line j);
. | | | | | Line i can be converted to line j by either
. just drawing j, or if they match, by moving
. line i to line j (with insert/delete line)
*/
This paper presents an algorithm for updating the image displayed on a conventional video terminal. It assumes that the terminal is capable of doing the usual insert/delete line and insert/delete character operations. It takes as input a description of the image currently on the screen and a description of the new image desired and produces a series of operations to do the desired transformation in a near-optimal manner. The algorithm is interesting because it applies results from the theoretical string-to-string correction problem (a generalization of the problem of finding a longest common subsequence), to a problem that is usually approached with crude ad-hoc techniques.
[...]
6. Conclusion
The redisplay algorithm described in this paper is used
in an Emacs-like editor for Unix and a structure editor.
It's performance has been quite good: to redraw
everything on the screen (when everything has changed)
takes about 0.12 seconds CPU time on a VAX 11/780
running Unix. Using the standard file typing program,
about 0.025 seconds of CPU time are needed to type one
screenful of text. Emacs averages about 0.004 CPU
seconds per keystroke (with one call on the redisplay per
keystroke).
Although in the interests of efficency we have stripped
down algorithm 5 to algorithm 6 the result is still an
algorithm which has a firm theoretical basis and which is
superior to the usual ad-hoc approach.
I wonder how the ratio of display overhead to cost to compute the minimal cost path has changed over the years? At some point it must be cheaper to do a "dumb" redraw if the display hardware is fast enough?
Also, it seems like this might also be useful for computing minimal diffs, eh?
It was definitely worth it if you had a 300 baud modem and a lightly loaded VAX mostly to yourself. But it became an overkill with higher baud rates and high speed networks.
It's based on the "string-to-string correction problem" (edit distance / Levenshtein distance), combined with "dynamic programming" (finding the best cost path through a cost matrix), which is (kind of) how diff works, on a line-by-line basis for inserted and deleted lines, and then on a character-by-character basis between changed lines. Each "edit" has a "cost" determined by the number of characters you have to send to the terminal (length of the escape codes to perform an edit in multiple possible ways, versus just redrawing the characters). Emacs computes its way through a twisty maze of little escape codes each time it updates the screen.
H. M. Wagner and M. J. Fischer. "The string-to-string correction problem." JACM 21, 1 (January 1974), 168-173
>In computer science, the string-to-string correction problem refers to determining the minimum number of edit operations necessary to change one string into another (i.e., computing the shortest edit distance). A single edit operation may be changing a single symbol of the string into another, deleting, or inserting a symbol. The length of the edit sequence provides a measure of the distance between the two strings.
>In computing, the diff utility is a data comparison tool that calculates and displays the differences between two files. Unlike edit distance notions used for other purposes, diff is line-oriented rather than character-oriented, but it is like Levenshtein distance in that it tries to determine the smallest set of deletions and insertions to create one file from the other. The diff command displays the changes made in a standard format, such that both humans and machines can understand the changes and apply them: given one file and the changes, the other file can be created.
>In information theory, linguistics and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.
>In computational linguistics and computer science, edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting words from a dictionary that have a low distance to the word in question. In bioinformatics, it can be used to quantify the similarity of DNA sequences, which can be viewed as strings of the letters A, C, G and T.
Here's a Levenshtein demo that shows you the cost matrix with the shortest path highlighted. Try calculating the distance between "fotomat" and "tomato", and you can see the shortest path which has a cost of 3 edits (1: delete f at beginning, 2: delete o at beginning, 3: insert o at end):
>Levenshtein distance is obtained by finding the cheapest way to transform one string into another. Transformations are the one-step operations of (single-phone) insertion, deletion and substitution. In the simplest versions substitutions cost two units except when the source and target are identical, in which case the cost is zero. Insertions and deletions costs half that of substitutions. This demonstration illustrates a simple algorithm which basically looks at all of the different ways for operations to transform one string to another.
Cheers! I'm a big fan of yours BTW. Long ago, in a forum far away, your were the one to clue me in to NeWS (and pie menus too!) Blew open my concepts of what a display system could be. I'll always be grateful to you Don Hopkins.
I believe this is "flow fields" or "potential fields" -- as mentioned by other commenters, scaling it with graph size and destination-count are two of the main issues with it.
In SimAirport we use flow fields, they're great for crowds & they scale incredibly well with agent count. We use them for "long distance" traversals, and then we use "local" A-star like grid search for the "last-mile" of an agent's journey.
For flow-fields, we start by dividing everything up into "sectors" based on game logic -- and then further, into "sub-sectors", essentially just basic abstractions of the graph with smaller node counts. It does have an impact on path quality but for the scale we operate at in SimAirport the trade-off is a no brainer. The abstractions then limit the flow-field's graph size & the destination-count that any given subsector (abstracted-node) can have -- essentially it's a hierarchical (pseudo HPA-star) approach to flowfield pathfinding / navigation, and the abstraction also has major benefits for reachability queries & game-logic driven "rules" (ie load-balancing of capacity-constrained paths [stairs/escalators], one-way paths [ie directed graph algos], etc).
Sounds like a classic potential field implementation, and yeah, they're very useful. There are two problems: scaling with map size (large maps or high detail maps), and update cost in action-heavy games (if the field source moves on every frame). But for small roguelikes it's a very good fit.
Presumably you could set the fill radius to a maximum corresponding to the largest detection distance possible (for the monsters you have in-game or even in-level)?
There's no point filling an entire level if monsters only react when the player gets within say 30 tiles. Even line of sight should have limits.
You could add something like an A* heuristic on top where monsters are drawn to the general direction of the player based on a vector, but switch to shortest-path when they're close. (but then you worry about getting stuck I guess)
Performance is a serious problem still. Well, if the world needs to be updated often (multiple times a second).
Consider a semi-open map with 30 tiles sides, we're talking a thousand tiles in total and/or paths a hundred tiles long. For the effect to propagate to the 30th tile away, the propagation algorithm has to iterate more than 30 times.
Graph algorithms are generally slow (whole milliseconds on large graphs) but repeated many times over they can get truly horrendous.
One trick is to spread the computation out over time, if you don't need to do it all at at once every frame. Since the enemies don't move that fast, a bit of delay might be good enough, depending on what you're tracking.
SimCity has several layers like pollution, land value, etc, which slowly diffuse over time. But it only does that computation every so often, not every frame. It has a 16 phase simulation clock, and it scans the cells of the map in eight stripes over eight steps, then scans different layers like taxes, traffic and rate of growth decay, power, pollution and land value, police coverage and crime, population density, fire coverage and disasters, and the RCI valves. (That made it possible to run on a C64!)
Chaim Gingold's SimCity Reverse Diagrams show how the different phases of "Simulate()" perform different kinds of analysis over time, and how the different map layers interact with each other.
>SimCity reverse diagrams, by Chaim Gingold (2016).
>These reverse diagrams map and translate the rules of a complex simulation
program into a form that is more easily digested, embedded, disseminated, and
and discussed (Latour 1986).
>If we merge the reverse diagram with an interactive approach—e.g. Bret Victor’s Nile Visualization (Victor 2013), such diagrams could be used generatively, to describe programs, and interactively, to allow rich introspection and manipulation of software.
>Latour, Bruno (1986). “Visualization and cognition”. In: Knowledge and Society 6 (1986), pp. 1– 40.
>Librande, Stone (2010). “One-Page Designs”. Game Developers Conference. 2010.
>Victor, Bret (2013). “Media for Thinking the Unthinkable”. MIT Media Lab, Apr. 4, 2013.
Yeah this is an interesting problem for exploring stuff like recursive/non-recursive code, the overhead of conditional statements when you have to run thousands of them, etc. I've seen some examples online where people claim a few million voxels a second (I've not tested though).
I think this is also how 3D printers work (effectively). They do a form of scan-line filling to minimise the number of jumps between sections of filament.
Thanks for the alternative name! Helps further research the technique. The core idea is relatively simple so I'm not surprised these are known. But finding a name of something you know is not easy with just a search engine.
Sure thing! And as I mentioned in another comment, if you google for "potential fields" and "flow fields" in games, there's a whole bunch of papers and talks on this.
Now that you mentioned "flow field", I remember seeing it in a Supreme Commander AI demonstration technique. Two columns of tanks can navigate through each other seamlessly, without chaos. And in realtime. Imagine seeing this in the days of Warcraft 1, where it was common for a unit to use left hand rule across the entire map because the bridge was momentarily occupied.
Interesting, because these kinds of potential fields are similar to the "predictive map" representations we see in the firing rates of hippocampal cells.
And also called... dynamic programming in some communities. On of the most classical algo, just before the more interesting class of problems where the map size is too big to do it
I encountered a similar problem in a very different situation a few years ago: I had to route video streams across a network of nodes connected by point-to-point connections.
At first I thought I would apply a pathfinding algorithm like A* to route each stream but given that it was an arbitrary graph and not a 2D map there was no obvious heuristic to guide the algorithm, so it devolved into an exhaustive search.
But then I realized that I had relatively few nodes (a dozen at most) and many more streams to route, so it made more sense to compute the distance between every pair of nodes once (n^2 with the number of nodes using a trivial implementation, less if you optimize it but I didn't even need to) and then I could lookup the path between any two nodes in linear time. Given that this was all running on an underpowered embedded system it made a significant difference in performance when routing hundreds of streams.
I'm not very good at big-O. I meant that in order to build the path I would need to lookup the neighboring nodes to find the closest to the target, then just to that node and continue until the end. So it's linear with the average distance I guess.
Yeah, linear with regards to the number of nodes - double the amount of nodes and you'll need to do double the look-ups; and constant with regards to the number of streams - no matter how many streams, always the same number of look-ups.
What you're looking for is influence maps. See the 2 articles linked below. They helped me tremendously when I was developing AI for a 2D game.
Another unrelated thing I found very useful is Floyd Warshall. It's a graph algorithm to compute efficiently ALL shortest paths between any 2 nodes. Real time pathfinding was too slow to use, had to precompute wherever possible, turns out most of the world is fixed and can be precomputed.
Not really - a shortest path algorithm outputs a single path between a start and an end point.
This gives you a value (distance to player) for every square on the board; you can do a lot of things with that data, like choose between multiple "best steps" based on some other preference, or choose "best steps" for a large number of monsters efficiently.
No, it doesn't give you a distance-to-player for every square on the board. It is closer to giving you an ordered list of 'most recent player positions' - notably, the point directly in front of the player is not necessarily in this list. These positions contain "time since player was here," which is not a distance to player. (It could be thought of as a maximum bound on distance-to-player, I suppose).
If the player's route is not straight, some squares with "fresher scent" may be physically more distant from the player than squares with less fresh scent.
Not exactly - a shortest path algorithm would lead to the monsters "anticipating" the player coming around the obstacle and cutting them off.
The algorithm is summarized in the article as:
"When an enemy enters the Chase state it tries to raycast to the player and if nothing is in the way - chase em! If something is in the way, it goes through the scent trail in order and tries to raycast to each scent until it can see one, then - chase it!"
With complex geometry, this will frequently lead to a non-optimal route, and is probably more fun to play against.
I've always known this as flow field pathfinding and it can get tricky if you have a dynamic map or enough changing that you need to recalculate often. Here's a random article I'm sure I read years ago describing it.
IIRC he uses the same data structure but with hill climbing vs gradient descent to guide monkeys/imps/anything else that steals an item and runs from you. Clever.
1. You only need to do the path generation once, and it scales very well with the number of monsters. 2. It's really good at combining several concerns, because you can generate a couple of heat maps for different concerns and add them up. For example one heat map is about proximity to player. Another could be about proximity to health pickups, or proximity to cover, or proximity to open space if a monster likes to keep distance and shoot. If a gas trap is triggered, you can use a "danger" heat map. Then a monster can easily get closer to player and at the same time choose the path which has fewer harmful effects.
That's why centaur archers in Brogue are so annoying, monsters avoid traps intelligently, monster groups avoid wasting their numerical advantage by chasing player through a corridor.
He described it in detail in this article: http://www.roguebasin.com/index.php?title=The_Incredible_Pow...
And in case you're wondering, Brogue source code is licensed on AGPLv3.