Graph Algorithms

Representing networks, breadth- and depth-first traversal, topological sorting, shortest paths with Dijkstra, and minimum spanning trees with Kruskal's algorithm.

Look back over everything this course has covered so far, and a quiet pattern emerges: arrays are sequences, trees are hierarchies, and both are, in a precise sense, special cases of something more general — a collection of things, and a collection of connections between them. That more general shape is a graph, and an enormous fraction of the interesting structure in the world turns out to be, underneath, exactly that: cities and the roads between them, web pages and the links between them, people and their friendships, tasks and their dependencies, computers and their network connections, course topics and the prerequisite relationships that determine what order you can learn them in (an example that should feel close to home, on a site whose entire purpose is organizing exactly that kind of structure). Graphs are not one more topic to add to your toolkit — they are closer to the toolkit’s frame, the general shape that arrays, trees, and most of the structures you’ve already met turn out to be special cases of. This chapter introduces the vocabulary for talking about graphs precisely, the two fundamental ways of exploring them, and several algorithms that answer some of the most naturally useful questions you can ask of a network: what’s reachable from here, in what order must these things happen, what’s the shortest route, and what’s the cheapest way to connect everything?

Vocabulary and Representation

A graph G = (V, E) consists of a set of vertices (or nodes) V and a set of edges E, where each edge connects a pair of vertices. Edges may be directed (a one-way relationship — “task A must finish before task B can start”) or undirected (a symmetric one — “city A and city B are connected by a two-way road”), and they may carry a weight (a cost, distance, or capacity associated with traversing them) or be unweighted (every connection counts equally). Most of the algorithms in this chapter work, with only minor adjustments, across all four combinations of these two choices — and recognizing which combination a real-world problem maps onto is usually the very first design decision you’ll need to make.

Before you can write a single graph algorithm, you need a way to represent one in memory — and, just as with the choice between an array and a linked list back in the Data Structures course, this representation choice has real consequences for which operations are cheap and which are expensive:

Representation	Space	”Are `u` and `v` connected?"	"List all of `u`’s neighbours”	Best suited to
Adjacency matrix — an `n × n` grid where cell `[u][v]` records whether (and how expensively) an edge connects them	`Θ(n²)`	`Θ(1)`	`Θ(n)` — must scan a whole row	Dense graphs; frequent “are these connected?” queries
Adjacency list — for each vertex, a list of the vertices it connects to	`Θ(V + E)`	`Θ(deg(u))` in the worst case	`Θ(deg(u))` — exactly as long as it needs to be	Sparse graphs (the overwhelmingly common case in practice); traversal-heavy algorithms

Real-world graphs are overwhelmingly sparse — a social network with a billion users still sees each person connect to only a few hundred others, not the other 999,999,900 — which is precisely why the adjacency list dominates in practice, and why every algorithm in this chapter is built on top of one. Here is the representation we’ll use throughout, kept deliberately simple with fixed-size arrays so the algorithms stay the visible focus rather than memory-management bookkeeping (in production code you would typically back this with the dynamically-growable arrays from the Complexity Analysis chapter’s IntVector):

#include <stdbool.h>

#define MAX_VERTICES 100

typedef struct {
    int to;
    int weight;          /* ignored (treat as 1) for unweighted graphs */
} Edge;

typedef struct {
    Edge edges[MAX_VERTICES];
    int count;
} AdjacencyList;

typedef struct {
    int num_vertices;
    AdjacencyList adj[MAX_VERTICES];
} Graph;

void add_edge(Graph *g, int u, int v, int weight) {
    g->adj[u].edges[g->adj[u].count++] = (Edge){ .to = v, .weight = weight };
    g->adj[v].edges[g->adj[v].count++] = (Edge){ .to = u, .weight = weight };  /* omit this line for directed graphs */
}

Pause and think: That single commented-out-in-spirit line — the one that adds the edge in both directions — is the entire difference between representing a directed and an undirected graph in this scheme. Take a moment to convince yourself that’s really true: is there any algorithm later in this chapter whose correctness would be affected by which of the two forms its input graph used? (The honest answer is “yes, more than one” — keep that question in your pocket as you read; it will resurface, with teeth, in the discussion of cycle detection.)

Breadth-First Search: Exploring in Widening Rings

Breadth-first search (BFS) explores a graph the way ripples spread across a pond: visit the source, then everything exactly one edge away, then everything exactly two edges away, and so on — never visiting something at distance k until everything at distance k - 1 has been visited. This “widening rings” discipline is enforced with a queue — the same first-in-first-out structure from the Data Structures course’s Stacks and Queues chapter, doing here exactly the job it was designed for: guaranteeing that vertices are processed in the order they were discovered, which is exactly the order of their distance from the source.

#include <stdbool.h>

void bfs(const Graph *g, int source, int distance[]) {
    bool visited[MAX_VERTICES] = { false };
    int queue[MAX_VERTICES];
    int front = 0, back = 0;

    for (int i = 0; i < g->num_vertices; i++) {
        distance[i] = -1;                      /* -1 means "not reachable from source" */
    }

    visited[source] = true;
    distance[source] = 0;
    queue[back++] = source;

    while (front < back) {
        int u = queue[front++];
        for (int i = 0; i < g->adj[u].count; i++) {
            int v = g->adj[u].edges[i].to;
            if (!visited[v]) {
                visited[v] = true;
                distance[v] = distance[u] + 1;  /* one ring further out than u */
                queue[back++] = v;
            }
        }
    }
}

Each vertex enters the queue exactly once (guarded by visited), and each edge is examined at most twice (once from each endpoint) — giving Θ(V + E) overall, the gold-standard running time for a graph algorithm: it cannot do better than looking at every vertex and every edge at least once, and BFS does no more than a small constant multiple of that minimum.

The single most useful fact about BFS — worth committing to memory, because it’s the reason BFS is the correct tool for an entire category of problems rather than merely a tool: in an unweighted graph, the distance BFS records for each vertex is its true shortest-path distance from the source, measured in number of edges. This isn’t a coincidence or a happy side effect — it follows directly from the “widening rings” discipline: by the time BFS first reaches a vertex v, it has already exhausted every vertex at every smaller distance, so there is no shorter way to have reached v than the one that just did.

Try it yourself: bfs records distances, but not the actual shortest paths. Extend it to also fill in a predecessor[] array — predecessor[v] should record the vertex that first discovered v during the search. Then write a second function that uses predecessor[] to reconstruct and print the actual shortest path from the source to any reachable vertex, by walking backward from the destination to the source and reversing the result. (If that “compute the answer first, then walk backward through recorded choices to reconstruct the actual solution” pattern feels familiar, it should — it’s the exact same idea you used to recover the chosen items from the 0-1 knapsack table in the Dynamic Programming chapter. The specific structure changes from problem to problem; the shape of “answer, then path-to-the-answer, via a backward walk through recorded decisions” turns out to be one of the most recurring ideas in all of algorithm design.)

Depth-First Search: Committing to a Path Until It’s Exhausted

Depth-first search (DFS) takes the opposite philosophy: plunge as deep as possible along one path, and only back up when you run out of unexplored options. Where BFS needed an explicit queue to enforce its discipline, DFS’s natural discipline — “fully finish exploring from here before trying anything else” — is exactly what recursion already gives you for free, courtesy of the call stack you studied closely in the Recursion and Divide and Conquer chapter:

void dfs_visit(const Graph *g, int u, bool visited[], void (*process)(int)) {
    visited[u] = true;
    process(u);                                  /* do whatever work you need at u */
    for (int i = 0; i < g->adj[u].count; i++) {
        int v = g->adj[u].edges[i].to;
        if (!visited[v]) {
            dfs_visit(g, v, visited, process);   /* plunge deeper before considering u's other neighbours */
        }
    }
}

void dfs(const Graph *g, int source, void (*process)(int)) {
    bool visited[MAX_VERTICES] = { false };
    dfs_visit(g, source, visited, process);
}

Notice what’s not in this code: there’s no explicit stack data structure anywhere. The call stack is the stack — every recursive call to dfs_visit pushes a new frame recording exactly where to resume once the deeper exploration finishes, and returning from a call pops back to precisely that point. This is the cleanest illustration you’ll meet of an idea the Recursion chapter asked you to internalize: recursion is not a separate mechanism from explicit stack-based iteration — it’s the same mechanism, with the bookkeeping handled for you. (And, true to that chapter’s warning about deep recursion, a sufficiently long path in a sufficiently large graph can exhaust the call stack — in production graph libraries, DFS is often written iteratively, with an explicit stack, for exactly this reason. Rewriting dfs_visit that way is an excellent, concrete way to test whether that equivalence has truly clicked for you.)

What DFS Reveals About a Graph’s Shape

BFS answers “how far is everything from here?” — a question about distance. DFS answers a different family of questions, about structure: as it explores, every edge it encounters falls into one of a small number of categories, and which categories appear tells you something deep about the graph itself.

The category that matters most for what follows is the back edge: an edge from a vertex to one of its own ancestors in the DFS exploration — a vertex currently “open” (visited, but not yet fully finished being explored). A directed graph contains a cycle if and only if a DFS over it encounters at least one back edge. This gives you a remarkably short, complete cycle-detection algorithm: run DFS, tracking which vertices are currently on the exploration path (as opposed to merely “visited at some point”) — typically with a small three-state colour scheme (white = untouched, grey = currently being explored, black = fully finished) — and declare victory the instant you find an edge pointing at a grey vertex.

Pause and think: Here is the question this chapter promised would resurface: does this cycle-detection rule — “a back edge means a cycle” — work correctly on an undirected graph, completely unmodified? Think very carefully about the edge you add from v back to u immediately after you’ve just arrived at v from u (recall: add_edge inserts the connection in both directions). Is that the kind of “cycle” the algorithm is supposed to be detecting — and if it isn’t, but the naive rule would flag it as one anyway, what is the smallest, most surgical change to the algorithm’s bookkeeping that distinguishes “a genuine cycle” from “the edge I just walked in on, pointed back at me”? Working through this distinction by hand, on a graph with as few as three vertices, is one of the most reliable ways to convince yourself that “directed” and “undirected” are not cosmetic labels on a graph — they change what its edges mean, and therefore what your algorithms are actually allowed to assume.

Topological Sorting: Putting Dependencies in a Valid Order

Here is a question that shows up constantly, in disguises that don’t always look like graph problems at first glance: given a set of tasks and a set of “must happen before” constraints between them, is there a valid order to do them all in — and if so, what is one? Course prerequisites (you can’t take “Algorithms” before “Programming Fundamentals”), build systems (you can’t link an object file before compiling its source), spreadsheet recalculation (you can’t compute a cell before the cells it references) — all of these are, structurally, the exact same problem: find a linear ordering of a directed acyclic graph’s (DAG’s) vertices such that every edge u → v places u somewhere before v.

That word “acyclic” is doing serious work in that definition — and DFS’s cycle-detection ability is precisely what lets you check it. If the dependency graph contains a cycle, no valid order can possibly exist (task A needs B, which needs C, which needs A — there is no first task to start with), and a correct topological sort routine must be able to report that, rather than silently producing a nonsensical answer.

Assuming the graph is acyclic, here is a wonderfully economical way to produce a valid order, built directly on the DFS you already have:

void topological_sort_visit(const Graph *g, int u, bool visited[], int result[], int *result_count) {
    visited[u] = true;
    for (int i = 0; i < g->adj[u].count; i++) {
        int v = g->adj[u].edges[i].to;
        if (!visited[v]) {
            topological_sort_visit(g, v, visited, result, result_count);
        }
    }
    result[(*result_count)++] = u;          /* record u only once everything reachable from it is done */
}

/* Produces a valid order in `result[0 .. n-1]`, READ BACKWARD (last-filled entry first). */
int topological_sort(const Graph *g, int result[]) {
    bool visited[MAX_VERTICES] = { false };
    int result_count = 0;
    for (int u = 0; u < g->num_vertices; u++) {
        if (!visited[u]) {
            topological_sort_visit(g, u, visited, result, &result_count);
        }
    }
    /* reverse result[0 .. result_count-1] in place to get a forward-readable order */
    for (int i = 0, j = result_count - 1; i < j; i++, j--) {
        int temp = result[i]; result[i] = result[j]; result[j] = temp;
    }
    return result_count;
}

The justification for why this produces a valid order rewards sitting with for a moment, because it’s a clean piece of reasoning rather than a fact to memorize: a vertex u is recorded only after the recursive call has completely finished exploring everything reachable from it — which means every vertex v with an edge u → v is recorded before u is (it had to finish, and be recorded, before u’s call could return). Reversing the list at the end therefore guarantees u appears before v for every edge u → v — precisely the property a topological order requires. The proof and the algorithm are, in a real sense, the same six lines, viewed from two different angles — a satisfying instance of something you’ll notice more and more as your algorithmic vocabulary grows: a clean correctness argument and a clean implementation are frequently just two descriptions of the same underlying insight.

Try it yourself: Sketch a small dependency graph for four or five courses on a hypothetical syllabus (say: “Programming Fundamentals” → “Data Structures” → “Algorithms,” with “Discrete Mathematics” feeding into “Algorithms” independently). Run topological_sort over it by hand, tracking the recursion exactly as the code specifies — note the order in which vertices finish (get appended to result), not the order they’re first visited. Then reverse that list. Does the order you get match an order you’d actually be allowed to take those courses in? Now add a deliberate cycle (make “Programming Fundamentals” depend on “Algorithms”) and trace through what topological_sort_visit does — at exactly which line does the algorithm, as written, fail to notice anything has gone wrong? That failure point is precisely the gap a production-quality version must close by folding in the cycle-detection bookkeeping from the previous section.

Shortest Paths in Weighted Graphs: Dijkstra’s Algorithm

BFS finds shortest paths beautifully — but only when every edge costs the same “one step” to traverse. The moment edges carry different weights (a highway segment that covers 50 km in one “edge” versus a side street covering 2 km), “fewest edges” and “least total cost” stop being the same question, and BFS’s guarantee evaporates. Dijkstra’s algorithm answers the weighted version: given a source vertex, find the cheapest route from it to every other vertex, in a graph where all edge weights are non-negative.

The strategy: maintain a current best known distance to every vertex (initially 0 for the source and ∞ for everything else), then repeatedly do the following — pick the not-yet-finalized vertex with the smallest known distance, mark it finalized, and “relax” every edge leaving it (that is, check whether routing through it offers a cheaper way to reach its neighbours than what’s currently recorded, and update if so):

#include <limits.h>

void dijkstra(const Graph *g, int source, int dist[]) {
    bool finalized[MAX_VERTICES] = { false };
    for (int i = 0; i < g->num_vertices; i++) {
        dist[i] = INT_MAX;
    }
    dist[source] = 0;

    for (int iteration = 0; iteration < g->num_vertices; iteration++) {
        /* pick the not-yet-finalized vertex with the smallest known distance */
        int u = -1;
        for (int i = 0; i < g->num_vertices; i++) {
            if (!finalized[i] && (u == -1 || dist[i] < dist[u])) {
                u = i;
            }
        }
        if (u == -1 || dist[u] == INT_MAX) {
            break;                              /* everything remaining is unreachable */
        }
        finalized[u] = true;

        for (int i = 0; i < g->adj[u].count; i++) {
            int v = g->adj[u].edges[i].to;
            int new_distance = dist[u] + g->adj[u].edges[i].weight;
            if (new_distance < dist[v]) {
                dist[v] = new_distance;          /* found a cheaper route to v, through u */
            }
        }
    }
}

Recognizing the Shape: This Is a Greedy Algorithm

Pause and notice what you’re looking at. “Repeatedly select the most promising candidate by one simple criterion, commit to it permanently, and never revisit that decision” — that is exactly the shape of every algorithm in the Greedy Algorithms chapter, right down to the structure of the proof obligation it incurs. Dijkstra’s algorithm is a greedy algorithm, and the question that chapter trained you to ask immediately applies in full force: how do we know that finalizing the vertex with the smallest current distance is always safe — that we’ll never need to revisit it once a vertex farther away gets explored?

The proof is a direct cousin of the exchange arguments you built by hand in that chapter, here taking the specific form of what’s usually called the cut property: suppose, for contradiction, that the true shortest path to the vertex u we’re about to finalize actually runs through some other, not-yet-finalized vertex x. Since edge weights are non-negative, the path to u through x cannot be shorter than the path to x alone — which means dist[x] <= dist[u]. But we chose u specifically because it had the smallest current distance among unfinalized vertices — so dist[u] <= dist[x]. Combine the two and dist[x] = dist[u], meaning routing through x could not possibly have produced a strictly shorter path to u than the one already recorded. No contradiction survives — the greedy choice was safe all along.

Pause and think: That proof leaned on exactly one fact about the edge weights — find the precise sentence where it happened. (It’s the step that turns “a path through x” into “at least as long as the path to x alone.”) Now ask: what happens to that one specific step the moment a negative-weight edge enters the picture — an edge that could make a longer-looking detour secretly cheaper overall? Try to sketch a tiny three- or four-vertex graph with one negative edge where Dijkstra’s algorithm — running exactly as written above, making no exceptions for the new kind of edge — finalizes a vertex, declares its distance permanent, and is then proven wrong by an edge it hasn’t looked at yet. (Algorithms that remain correct in the presence of negative weights — most famously the Bellman-Ford algorithm, which relaxes every edge, V - 1 times over, rather than committing to a greedy visiting order — exist precisely to repair this gap. They run slower, in exchange for working under weaker assumptions about their input — which should sound familiar: it’s the exact shape of trade-off you met when comparing binary search to interpolation search, and fractional to 0-1 knapsack. The specific algorithms keep changing; that one shape of trade-off — a faster algorithm, purchased by assuming more about the input, which becomes a liability the moment that assumption fails — keeps reappearing, because it is one of the truly fundamental tensions in all of algorithm design.)

The Cost of “Find the Smallest” — and a Familiar Optimization

Look closely at the inner loop that finds u: it scans every vertex, every single iteration, to find the minimum — O(V) work, repeated V times, for O(V²) just on the “pick the next vertex” bookkeeping. If this feels familiar, it should — it’s the exact same extract_min-scans-the-whole-array pattern from the Huffman coding algorithm in the previous chapter, and it has the exact same fix: a min-heap (the very structure heapsort built and climbed through in the Sorting chapter, oriented toward smallest-on-top rather than largest) turns “find and remove the smallest” into an O(log V) operation. With a heap-backed priority queue, Dijkstra’s running time falls to O((V + E) log V) — the standard, production-grade bound you’ll find quoted in every serious reference on the algorithm. Three different chapters, three different problems, and the same exact bottleneck-and-fix pattern each time — which is precisely the kind of recurring shape this course has been quietly training you to notice on sight, because once you see it once, you’ll start seeing it everywhere.

Minimum Spanning Trees: Connecting Everything as Cheaply as Possible

Here is a different, equally natural question to ask of a weighted, undirected, connected graph: if you need to connect every vertex together — by selecting some subset of the edges — what’s the cheapest possible way to do it, in terms of total edge weight? (Think: wiring every building on a campus with the least total cable, or connecting every city in a region with the least total cost of new road construction.) A spanning tree is any subset of edges that connects all vertices without forming a cycle (exactly V - 1 edges, for V vertices — any fewer and something would be disconnected; any more, and a cycle would necessarily appear); a minimum spanning tree (MST) is one with the smallest possible total edge weight.

Kruskal’s algorithm finds one with a greedy rule that should, by now, feel almost familiar in its phrasing: sort all edges by weight, ascending, and add each one to your growing forest unless doing so would create a cycle.

typedef struct {
    int u, v, weight;
} WeightedEdge;

int compare_by_weight(const void *a, const void *b) {
    return ((const WeightedEdge *) a)->weight - ((const WeightedEdge *) b)->weight;
}

“Unless doing so would create a cycle” is doing a lot of quiet work in that sentence — it requires, at every step, an efficient answer to “are these two vertices already connected through edges I’ve already chosen?” That is precisely the question the disjoint-set (union-find) data structure exists to answer, in very nearly constant time per query:

typedef struct {
    int parent[MAX_VERTICES];
    int rank[MAX_VERTICES];
} DisjointSet;

void ds_init(DisjointSet *ds, int n) {
    for (int i = 0; i < n; i++) {
        ds->parent[i] = i;
        ds->rank[i] = 0;
    }
}

int ds_find(DisjointSet *ds, int x) {
    if (ds->parent[x] != x) {
        ds->parent[x] = ds_find(ds, ds->parent[x]);   /* path compression: flatten the tree as we go */
    }
    return ds->parent[x];
}

bool ds_union(DisjointSet *ds, int a, int b) {
    int root_a = ds_find(ds, a);
    int root_b = ds_find(ds, b);
    if (root_a == root_b) {
        return false;                                  /* already connected: union would create a cycle */
    }
    if (ds->rank[root_a] < ds->rank[root_b]) {
        int temp = root_a; root_a = root_b; root_b = temp;
    }
    ds->parent[root_b] = root_a;                       /* attach the shorter tree under the taller one */
    if (ds->rank[root_a] == ds->rank[root_b]) {
        ds->rank[root_a]++;
    }
    return true;
}

With that machinery in place, Kruskal’s algorithm is almost an anticlimax — a short loop that simply does what its description promised:

int kruskal_mst(WeightedEdge edges[], int num_edges, int num_vertices, WeightedEdge mst_out[]) {
    qsort(edges, num_edges, sizeof(WeightedEdge), compare_by_weight);

    DisjointSet ds;
    ds_init(&ds, num_vertices);

    int mst_count = 0;
    for (int i = 0; i < num_edges && mst_count < num_vertices - 1; i++) {
        if (ds_union(&ds, edges[i].u, edges[i].v)) {   /* true: no cycle formed; keep this edge */
            mst_out[mst_count++] = edges[i];
        }
    }
    return mst_count;                                  /* should equal num_vertices - 1 if the graph is connected */
}

qsort dominates at O(E log E); with path compression and union-by-rank, the disjoint-set operations are so close to constant time that they barely register in the total. Sort, then greedily commit to each candidate that doesn’t create a problem — and refuse to revisit that decision. You have now seen this exact rhythm in activity selection, fractional knapsack, Huffman coding, and now minimum spanning trees. It is not a coincidence that the same rhythm keeps appearing — it’s a sign that you’re looking at a genuine family of problems sharing a deep structural similarity, one you’re now equipped to recognize the instant you meet a new member of it.

Try it yourself: The proof that Kruskal’s greedy rule is safe is called the cut property, and it has precisely the shape of every exchange argument you’ve built so far: take any partition of the vertices into two non-empty groups (a “cut”); among all edges crossing between the two groups, the lightest one is guaranteed to belong to some minimum spanning tree. Try to construct that proof yourself, in the style of the activity-selection argument from the previous chapter — assume an MST exists that doesn’t contain the lightest crossing edge, and show you can always swap something out for it without increasing the total weight. (Hint: adding that lightest crossing edge to any spanning tree that lacks it creates exactly one cycle — and that cycle is guaranteed to contain at least one other crossing edge, which you are now free to remove.) If you complete this proof on your own, on paper, without peeking at a reference — pause and notice that you have just independently derived one of the classic results in algorithmic graph theory, using nothing but the proof technique this course handed you two chapters ago. That is not a small thing, and it’s worth letting yourself feel it.

Prim’s Algorithm: Growing a Tree One Edge at a Time

Kruskal’s algorithm treats the edge set globally — sort all edges, add the cheapest safe one. Prim’s algorithm grows the MST differently: start at an arbitrary vertex and repeatedly add the cheapest edge that crosses from the current tree to a vertex not yet in it. The result is always the same set of MST edges; only the order of discovery differs.

The structure is nearly identical to Dijkstra’s: maintain a key[] array (the cheapest edge weight connecting each vertex to the current tree) and a in_mst[] flag. At each step, pick the not-yet-included vertex with the smallest key, add it to the tree, and update its neighbors’ keys.

#include <limits.h>

void prim_mst(const Graph *g, int parent[]) {
    int key[MAX_VERTICES];
    bool in_mst[MAX_VERTICES];

    for (int i = 0; i < g->num_vertices; i++) {
        key[i] = INT_MAX;
        in_mst[i] = false;
        parent[i] = -1;
    }
    key[0] = 0;                                    /* start at vertex 0 */

    for (int iter = 0; iter < g->num_vertices; iter++) {
        /* pick the not-yet-included vertex with the smallest key */
        int u = -1;
        for (int v = 0; v < g->num_vertices; v++)
            if (!in_mst[v] && (u == -1 || key[v] < key[u])) u = v;

        in_mst[u] = true;

        for (int i = 0; i < g->adj[u].count; i++) {
            int v = g->adj[u].edges[i].to;
            int w = g->adj[u].edges[i].weight;
            if (!in_mst[v] && w < key[v]) {
                key[v]    = w;                     /* cheaper connection found */
                parent[v] = u;                     /* record which tree vertex offers it */
            }
        }
    }
    /* MST edges: (parent[v], v) for all v != 0 */
}

Compare prim_mst against dijkstra line by line: the only difference is in the relaxation — Dijkstra updates dist[v] = dist[u] + weight (path cost), while Prim updates key[v] = weight (edge cost alone). Same greedy skeleton, different quantity being minimized. Both run in O(V²) with a plain array, or O((V + E) log V) with a min-heap.

Bellman-Ford: Shortest Paths with Negative Weights

Dijkstra’s algorithm cannot handle negative edge weights — its greedy “once finalized, never revisited” step breaks the moment a later edge could improve an already-finalized path. Bellman-Ford trades the greedy commitment for exhaustive re-relaxation: relax every edge, V - 1 times over.

Why V - 1 rounds? The shortest path in any graph without a negative cycle visits at most V - 1 edges. After round k, dist[v] is guaranteed to hold the correct shortest-path cost using at most k edges. After V - 1 rounds it is fully converged.

#include <limits.h>

typedef struct { int u, v, weight; } WEdge;

void bellman_ford(const WEdge edges[], int num_edges, int num_vertices,
                  int source, int dist[]) {
    for (int i = 0; i < num_vertices; i++) dist[i] = INT_MAX;
    dist[source] = 0;

    for (int round = 0; round < num_vertices - 1; round++) {
        for (int e = 0; e < num_edges; e++) {
            int u = edges[e].u, v = edges[e].v, w = edges[e].weight;
            if (dist[u] != INT_MAX && dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;             /* relax */
            }
        }
    }

    /* one extra round: any further relaxation means a negative cycle is reachable */
    for (int e = 0; e < num_edges; e++) {
        int u = edges[e].u, v = edges[e].v, w = edges[e].weight;
        if (dist[u] != INT_MAX && dist[u] + w < dist[v]) {
            printf("Negative cycle detected\n");
            return;
        }
    }
}

O(V · E) — slower than Dijkstra’s O((V + E) log V), but correct under conditions where Dijkstra is not. Bellman-Ford is also the foundation of distance-vector routing protocols (RIP, BGP), where each router only knows its immediate neighbors and must converge on shortest paths through iterative message-passing — a distributed version of the same V - 1 rounds of relaxation.

Floyd-Warshall: All-Pairs Shortest Paths

Dijkstra and Bellman-Ford answer “shortest paths from one source.” Floyd-Warshall answers “shortest paths between every pair of vertices” in one pass, using a beautifully concise triple-nested loop. It handles negative edge weights (but not negative cycles) and is the right tool when you need the full V × V distance matrix.

The key idea: for each intermediate vertex k, ask — “does routing through vertex k shorten the path from i to j?” Process every k from 0 to V-1 in order; after processing k, dist[i][j] is the shortest path that uses only vertices 0 through k as intermediates.

#define INF INT_MAX / 2    /* half of INT_MAX to avoid overflow when adding */

void floyd_warshall(int dist[][MAX_VERTICES], int n) {
    /* initialise: dist[i][j] should already hold the direct edge weight,
       INF if no direct edge, and 0 on the diagonal */

    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (dist[i][k] + dist[k][j] < dist[i][j]) {
                    dist[i][j] = dist[i][k] + dist[k][j];
                }
            }
        }
    }
    /* after all k: dist[i][j] is the true shortest-path distance from i to j */
}

Three lines of algorithmic content, O(V³) time, O(V²) space. The INF / 2 trick avoids signed integer overflow when adding two INF values together in the relaxation check. After the algorithm, dist[i][i] < 0 for any i indicates a negative cycle reachable from i.

Pause and think: Floyd-Warshall is a DP algorithm — the “intermediate vertex set grows by one each outer iteration” framing is exactly the DP pattern of building solutions from subsets of allowed resources. The recurrence dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]) is structurally identical to dp[i][j] = min(dp[i][j], dp[i][k] + dp[k+1][j]) from matrix-chain multiplication — “can I improve by splitting through an intermediate element?” Compare this against Dijkstra: both compute shortest paths, but Floyd-Warshall uses DP (considering all intermediates systematically) while Dijkstra uses a greedy priority order. What does that difference tell you about which algorithm applies when?

Maximum Flow: The Ford-Fulkerson Method

The maximum flow problem asks: given a directed graph where each edge has a capacity (the maximum flow it can carry), and a designated source s and sink t, what is the maximum amount of “stuff” (data, water, traffic) that can flow from s to t simultaneously, respecting all capacity constraints?

Ford-Fulkerson answers this by repeatedly finding an augmenting path — a path from s to t through which more flow can be pushed — and pushing as much flow as possible along it until no augmenting path remains. The residual graph tracks, for each edge, how much more flow it can carry (its unused capacity) and how much can be cancelled (its current flow, which can be “pushed back”). Using BFS to find augmenting paths (the Edmonds-Karp variant) guarantees O(V · E²) worst-case time.

#define MAX_V 100

int capacity[MAX_V][MAX_V];   /* capacity[u][v]: remaining capacity on edge u→v */

int bfs_find_path(int source, int sink, int n, int parent[]) {
    bool visited[MAX_V] = { false };
    int queue[MAX_V]; int front = 0, back = 0;
    visited[source] = true; parent[source] = -1;
    queue[back++] = source;
    while (front < back) {
        int u = queue[front++];
        for (int v = 0; v < n; v++) {
            if (!visited[v] && capacity[u][v] > 0) {   /* residual edge exists */
                visited[v] = true; parent[v] = u;
                if (v == sink) return 1;
                queue[back++] = v;
            }
        }
    }
    return 0;
}

int edmonds_karp(int source, int sink, int n) {
    int parent[MAX_V];
    int max_flow = 0;

    while (bfs_find_path(source, sink, n, parent)) {
        /* find the bottleneck capacity along the augmenting path */
        int path_flow = INT_MAX;
        for (int v = sink; v != source; v = parent[v]) {
            int u = parent[v];
            if (capacity[u][v] < path_flow) path_flow = capacity[u][v];
        }
        /* update residual capacities forward and backward */
        for (int v = sink; v != source; v = parent[v]) {
            int u = parent[v];
            capacity[u][v] -= path_flow;
            capacity[v][u] += path_flow;               /* back-edge allows cancellation */
        }
        max_flow += path_flow;
    }
    return max_flow;
}

The back-edge capacity[v][u] += path_flow is the algorithm’s most surprising line — it creates a “reverse” edge that lets a future BFS cancel previously committed flow if doing so opens a better augmenting path. Without this, the algorithm can get stuck in locally reasonable but globally suboptimal choices.

Maximum bipartite matching — assigning workers to tasks, students to projects, or packets to servers — reduces directly to max-flow: add a super-source connected to all left-side nodes and a super-sink connected to all right-side nodes (all capacities 1), then run Edmonds-Karp. The max flow equals the maximum number of disjoint matches.

Choosing the Right Tool

Question you’re asking	Reach for	Running time	Key requirement
”What’s reachable from here, and how many steps away?”	BFS	`Θ(V + E)`	None — works on any graph
”Does this graph contain a cycle? In what order must these dependencies run?”	DFS / topological sort	`Θ(V + E)`	Topological sort requires a DAG
”What’s the cheapest route from here to everywhere?”	Dijkstra	`O((V + E) log V)` with a heap	Non-negative edge weights
”What’s the cheapest route, and some edges might be negative?”	Bellman-Ford	`O(V · E)`	Handles negative weights; detects negative cycles
”What’s the cheapest path between every pair?”	Floyd-Warshall	`O(V³)`	Handles negative weights (not negative cycles)
“What’s the cheapest way to connect everything?”	Kruskal or Prim	`O(E log E)`	Undirected, connected, weighted graph
”How much can flow from source to sink?”	Edmonds-Karp	`O(V · E²)`	Directed graph with edge capacities

Try it yourself, as a closing reflection: every algorithm in this chapter is, at its heart, answering one of two questions — “in what order can/should I visit things?” or “what is the cheapest way to get from here to there/everywhere?” Take each algorithm in the table and sort it into one of those two bins (some may straddle both — and noticing which ones, and why, is itself revealing). Then ask the harder question: is there a third fundamental kind of question about graphs — one this chapter hasn’t touched at all — that feels like it ought to belong on this list? (As one provocation to chew on: every algorithm here assumes a single source, a single destination, or a single connected structure spanning everything. What happens to all of these techniques the instant you ask “what’s the maximum amount of stuff that can flow from this source to that destination, given that every edge has a capacity limiting how much can pass through it per unit time” — the network-flow family of problems? That question turns out to connect, in genuinely surprising ways, back to bipartite matching, project scheduling, and even certain problems you’ll meet framed very differently in the NP-completeness chapter ahead — a reminder that the map of “what connects to what” in this course is itself, fittingly, a graph worth exploring.)

Bringing It Together: Practice as a Problem-Solver

Diagnose before you code. For each scenario, name the single graph algorithm from this chapter best suited to it, and justify your choice in one sentence by naming the specific structural feature of the scenario that makes that algorithm the right fit (not just “it finds shortest paths” — why does this scenario’s structure match what that algorithm assumes about its input?): (a) finding the fewest number of bus transfers between two stops in a city, where every route is considered equally convenient; (b) determining a valid build order for a project’s source files given a list of #include dependencies; (c) finding the cheapest flight itinerary between two cities, where ticket prices vary wildly by route; (d) deciding which subset of proposed network cables to actually lay, to connect every office at minimum total cost.
Trace it by hand. Draw a small weighted, undirected, connected graph with six or seven vertices and eight or nine edges of varying weights (include at least one vertex reachable by more than one route, with meaningfully different total costs). Run Dijkstra’s algorithm from one chosen source entirely by hand, writing down the full dist[] array and which vertex gets finalized at every single iteration — not just the final answer. Then run Kruskal’s algorithm on the same graph, writing down which edges get considered, which get accepted, and which get rejected (and why, in terms of the disjoint-set state at that moment). Compare the edges Dijkstra’s shortest-path tree ends up implicitly using against the edges Kruskal’s MST selects. Are they the same set? Should they be? (If your intuition says “shortest paths from one source” and “cheapest way to connect everything” sound like they ought to produce the same structure, construct the smallest example you can find where they visibly don’t — and use it to articulate, precisely, what subtle difference in the two questions causes the difference in their answers.)
Extend a proof you’ve already built. Go back to the cut-property proof you sketched for Kruskal’s algorithm, and adapt it — with the smallest changes you can manage — into a justification for Prim’s algorithm, which builds an MST differently: starting from a single arbitrary vertex, it repeatedly adds the cheapest edge that connects the current tree to any vertex not yet in it (notice: this is structurally very close to Dijkstra’s “repeatedly finalize the closest candidate” rhythm, just optimizing a different quantity). Implement Prim’s algorithm in C — you’ll find you can reuse a remarkable fraction of Dijkstra’s structure, changing only the quantity being compared — and run it on the same graph from exercise 2. Do Kruskal’s and Prim’s algorithms produce the same minimum spanning tree? If your graph has any edges of equal weight, do they still agree — and if not, does that mean one of them is wrong, or does it reveal something about MSTs that the term “the minimum spanning tree” was quietly assuming, and shouldn’t have been?
Reflection. Across the last three chapters — Greedy Algorithms, Dynamic Programming, and now this one — you have repeatedly met the same underlying question, dressed in different clothes each time: “can I trust a simple, locally-reasoned rule to produce a globally correct answer, and how do I actually know, rather than hope?” Pick any two algorithms from this chapter (Dijkstra and Kruskal are a particularly rich pair, since both are explicitly greedy, yet solve visibly different problems) and write three or four sentences comparing the specific property of the input each one’s correctness proof leans on (non-negative weights, for Dijkstra; the cut property, for Kruskal) — and what, concretely, goes wrong, in each case, the moment that property is no longer guaranteed. If you can hold both proofs in your head at once, side by side, and articulate exactly where and why they diverge, you have built something far more durable than the ability to recite either algorithm: the instinct to ask, of any greedy-looking rule you ever invent yourself, exactly which fact about my input am I secretly relying on — and what happens to my answer the day that fact stops being true?