Dijkstra on sparse graphs¶
For the statement of the problem, the algorithm with implementation and proof can be found on the article Dijkstra's algorithm.
Algorithm¶
We recall in the derivation of the complexity of Dijkstra's algorithm we used two factors:
the time of finding the unmarked vertex with the smallest distance
In the simplest implementation these operations require
It is clear, that this complexity is optimal for a dense graph, i.e. when
To accomplish that we can use a variation of multiple auxiliary data structures.
The most efficient is the Fibonacci heap, which allows the first operation to run in
As a compromise you can use data structures, that perform both types of operations (extracting a minimum and updating an item) in
C++ provides two such data structures: set
and priority_queue
.
The first is based on red-black trees, and the second one on heaps.
Therefore priority_queue
has a smaller hidden constant, but also has a drawback:
it doesn't support the operation of removing an element.
Because of this we need to do a "workaround", that actually leads to a slightly worse factor
Implementation¶
set¶
Let us start with the container set
.
Since we need to store vertices ordered by their values set
pairs are automatically sorted by their distances.
const int INF = 1000000000;
vector<vector<pair<int, int>>> adj;
void dijkstra(int s, vector<int> & d, vector<int> & p) {
int n = adj.size();
d.assign(n, INF);
p.assign(n, -1);
d[s] = 0;
set<pair<int, int>> q;
q.insert({0, s});
while (!q.empty()) {
int v = q.begin()->second;
q.erase(q.begin());
for (auto edge : adj[v]) {
int to = edge.first;
int len = edge.second;
if (d[v] + len < d[to]) {
q.erase({d[to], to});
d[to] = d[v] + len;
p[to] = v;
q.insert({d[to], to});
}
}
}
}
We don't need the array set
to store that information, and also find the vertex with the shortest distance with it.
It kinda acts like a queue.
The main loops executes until there are no more vertices in the set/queue.
A vertex with the smallest distance gets extracted, and for each successful relaxation we first remove the old pair, and then after the relaxation add the new pair into the queue.
priority_queue¶
The main difference to the implementation with set
is that in many languages, including C++, we cannot remove elements from the priority_queue
(although heaps can support that operation in theory).
Therefore we have to use a workaround:
We simply don't delete the old pair from the queue.
As a result a vertex can appear multiple times with different distance in the queue at the same time.
Among these pairs we are only interested in the pairs where the first element is equal to the corresponding value in
By default a priority_queue
sorts elements in descending order.
To make it sort the elements in ascending order, we can either store the negated distances in it, or pass it a different sorting function.
We will do the second option.
const int INF = 1000000000;
vector<vector<pair<int, int>>> adj;
void dijkstra(int s, vector<int> & d, vector<int> & p) {
int n = adj.size();
d.assign(n, INF);
p.assign(n, -1);
d[s] = 0;
using pii = pair<int, int>;
priority_queue<pii, vector<pii>, greater<pii>> q;
q.push({0, s});
while (!q.empty()) {
int v = q.top().second;
int d_v = q.top().first;
q.pop();
if (d_v != d[v])
continue;
for (auto edge : adj[v]) {
int to = edge.first;
int len = edge.second;
if (d[v] + len < d[to]) {
d[to] = d[v] + len;
p[to] = v;
q.push({d[to], to});
}
}
}
}
In practice the priority_queue
version is a little bit faster than the version with set
.
Interestingly, a 2007 technical report concluded the variant of the algorithm not using decrease-key operations ran faster than the decrease-key variant, with a greater performance gap for sparse graphs.
Getting rid of pairs¶
You can improve the performance a little bit more if you don't store pairs in the containers, but only the vertex indices.
In this case we must overload the comparison operator:
it must compare two vertices using the distances stored in
As a result of the relaxation, the distance of some vertices will change. However the data structure will not resort itself automatically. In fact changing distances of vertices in the queue, might destroy the data structure. As before, we need to remove the vertex before we relax it, and then insert it again afterwards.
Since we only can remove from set
, this optimization is only applicable for the set
method, and doesn't work with priority_queue
implementation.
In practice this significantly increases the performance, especially when larger data types are used to store distances, like long long
or double
.