Simulating Cost Flow

This post assumes knowledge of min/max cost flow and a rough understanding of the high-level methods for solving it (successive shortest path and cycle canceling). Knowledge of specific algorithms and implementation details is not expected. Solutions to some problems will also use the Aliens trick.


The cost flow problem is a well-known problem in the literature that a whole slew of competitive programming problems can be formulated as. Unfortunately, competitive programming problems typically have constraints set too high for directly running even the most state-of-the-art cost flow algorithm. However, in certain problems the flow network has a very specific structure that allows us to greedily determine the optimal flow (i.e. proving a greedy with flows) or to "simulate" the flow algorithm faster with data structures. That is what this post will be about.

I will say that I do not claim novelty over these ideas. This idea is (unsurprisingly) well-known in China. This particular post and some of the example problems are taken from this Chinese blog.

I will also describe everything below in terms of min cost flow, max cost is of course analogous. When describing flow network constructions, I will shorthand with (cost,capacity) for describing edges.

High-level Methods for Cost Flow

Most cost flow algorithms fall into one of these categories of high-level approaches. The two in particular that we will look at are:

  1. Successive Shortest Path: Repeatedly find a min cost augmenting path (a path with minimum total cost where all edges can still contain more flow) from source to sink and send flow through that path.
  2. Cycle Cancelling: Repeatedly find a negative cost cycle (a cycle with negative total cost where all edges can still contain more flow) and send flow through that cycle.

Note that in the cycle cancelling paradigm, there is no concept of source and sink, but you can solve for source and sink by adding an edge from sink to source with cost of −∞ and capacity of k for the min cost k-flow (or capacity of ∞ if you want min cost max flow).

Min Cost Flow is Convex

Min cost flow as a function of amount of flow is convex. This makes intuitive sense, as you can always rearrange the augmenting paths you choose to pick less steep ones (lower cost) before more steep ones (higher cost). This fact is useful as it allows us to use Aliens trick.

Codeforces 802O: April Fools’ Problem

Statement Summary: You are preparing and printing k problems for a contest over the course of n days. On the ith day, you can prepare a problem for cost a_i and print a problem for cost b_i. You must prepare a problem before printing it. In other words, if you prepare a problem on day i, you can only print it on some day j ≥ i. You can prepare and print at most one problem per day. What is the minimum cost of preparing and printing k problems?

Constraints: 1 ≤ k ≤ n ≤ 5·10^5, 1 ≤ a_i, b_i ≤ 10^9


Let’s first think about solving this with flows. One straightforward way is to create 2n+2 nodes: a source s and sink t, plus nodes representing a_i and b_i for each i. We add the following edges:

  • s to each a_i node with (a_i, 1)
  • each b_i node to t with (b_i, 1)
  • each a_i node to each b_j node where i ≤ j with (0, 1)

The answer is the min cost k-flow of this network.

Image 1

Example for n=3

Now let’s try to optimize using one of two methods. Both are possible and lead to different solutions for this problem.

Successive Shortest Path

This is the method described in the official editorial.

First, let’s simplify this network to not use Ω(n²) edges. Consider a slightly different setup with n+2 nodes: a source s and sink t, plus nodes 1 through n. We add the following edges:

  • s to each i with (a_i, 1)
  • each i to t with (b_i, 1)
  • i to i+1 with (0, ∞)

I claim the answer is also the min cost k-flow of this network. Essentially, a matching consists of picking some a_i, waiting zero or more days (represented by taking some number of i → i+1 edges), and then picking some b_j where i ≤ j.

Image 2

Example for n=4

An augmenting path consists of sending flow from s to some i, then from i to j, and finally from j to t.

  • s → i and j → t must of course have no flow going through them.
  • If i ≤ j, this is ok. We add +1 flow to each edge on the path from i to j.
  • If i > j, then we are sending flow along reverse edges and cancelling out existing flow, so every edge on the path between i and j must have at least 1 flow, and the augmenting path adds −1 flow to each of those edges.

And in the successive shortest path algorithm, we simply need to choose the augmenting path with minimum cost each time. So we need to choose a pair of indices (i, j) with minimum a_i + b_j, such that either i ≤ j, or there exists flow on every edge on the path between i and j.

So we need to be able to add +1 or −1 to a range and recompute the minimum cost valid (i, j) pair. This sounds like a job for lazy segment tree! We just repeat k iterations of querying the segtree for (i, j) and adding +1 or −1 to all edges between i and j, giving us an O(k log n) solution.

The details for determining the minimum cost valid (i,j) pair are quite tedious, but the TLDR is that you have to store first and second minimums of the amount of flow through any edge in a subarray. I have an old submission that implements this idea with a ton of variables in the segtree node. It can probably be reduced with more thought.

Cycle Cancelling

To drastically simplify our analysis, we will apply the Aliens trick. Binary search on some reward c: there is no requirement of at least k problems anymore, but instead you get a reward of c for completing one problem. So preparing a problem on day i and printing it on day j costs a_i + b_j − c. In network terms, there is an edge from t to s with (−c, ∞). We binary search for the minimum c such that we complete at least k problems. This is valid because we know min cost flow is convex with respect to the amount of flow.

Now consider incrementally adding in nodes in decreasing order (from n to 1) and consider how the optimal flow changes after each additional node. When we add node i, we add the edges s → a_i, b_i → t, and a_i → b_j for all j ≥ i. Upon adding node i, we consider the possibilities for how a new negative cycle could form.

  1. s → a_i → b_j → t → s with cost a_i + b_j − c. This is essentially matching a_i with some unmatched b_j.
  2. s → a_i → b_j → a_k → s with cost a_i − a_k. This is cancelling out some existing flow passing through s → a_k → b_j and is equivalent to changing the matching to match b_j with a_i instead of a_k.

Image 3 Image 4

Examples of negative cycles of the first and second types

Those are actually the only two cases we need to consider. For example, you might think we need to consider some cycle of the form s → a_i → b_j → t → b_k → a_l → s. This cycle would represent taking the matched pair (a_i, b_j) instead of (a_l, b_k). However, if (a_l, b_k) was originally matched in the first place, that would imply a_l + b_k ≤ c, as otherwise t → b_k → a_l → s → t would be a negative cycle with weight c − a_l − b_k and we would have had to handle that before adding node i. And if s → a_i → b_j → t → b_k → a_l → s was a negative cycle, that would imply a_i + b_j < a_l + b_k and therefore a_i + b_j < c, so we could instead take the negative cycle s → a_i → b_j → t → s. In fact, we would ultimately converge to that state regardless, because if we took s → a_i → b_j → t → b_k → a_l → s first, then s → a_l → b_k → t → s would form a negative cycle afterwards, so the end result after resolving all negative cycles is identical.

You can also confirm that after taking a negative cycle of one of the two types listed above (whichever one is more negative), we will not create any further negative cycles. So essentially, each time we add a node i to the network, we take at most one additional negative cycle.

Example: Can taking a type 2 cycle lead to creating a type 1 cycle?

If we break up some old pair (a_k, b_j) and instead match (a_i, b_j), is it then possible for a_k to be paired with some other b_l?

If we are breaking up the pair, that implies a_i < a_k. Now suppose we could afterwards pair a_k with b_l, implying a_k + b_l < c. Substituting a_i < a_k yields a_i + b_l < c. Furthermore, compare the two cycle costs a_i − a_k and a_i + b_l − c: since a_k + b_l < c gives b_l − c < −a_k, we have a_i + b_l − c < a_i − a_k, so it would have been more optimal to match a_i with b_l in the first place, a contradiction.


Therefore, we can simply maintain a priority queue of potential negative cycles and iterate i from n to 1. When we match (a_i, b_j), we insert −a_i into the priority queue to allow for breaking up that matching and pairing b_j with something else in the future. When we leave b_i unmatched, we insert b_i − c into the priority queue to allow for matching with that b_i in the future. And at each step, if a_i plus the minimum value in the priority queue is at most 0, then we take that option. The overall complexity is O(n log n log A), where A is the range of the binary search on c (but with a much better constant than the successive shortest path solution).

Code for Clarity
auto solve = [&] (long long c) -> pair<int, long long> {
    int cnt = 0;
    long long cost = 0;
    // (cycle cost, type): type 0 creates a new matching, type 1 breaks an old one
    priority_queue<pair<long long, int>, vector<pair<long long, int>>, greater<pair<long long, int>>> pq;
    for (int i=n-1; i>=0; i--) {
        pq.emplace(b[i] - c, 0);
        if (a[i] + pq.top().first <= 0) {
            cnt += pq.top().second == 0;
            cost += a[i] + pq.top().first;
            pq.pop();
            pq.emplace(-a[i], 1);
        }
    }
    return {cnt, cost + (long long) k * c};
};

// c can be up to 2e9, which overflows int, so the binary search uses long long
long long l = 0, r = 2e9 + 5;
while (l < r) {
    long long m = midpoint(l, r);
    if (solve(m).first >= k)
        r = m;
    else
        l = m + 1;
}

cout << solve(l).second << "\n";

The priority queue stores negative cycles as a pair of cost and what type it is (swapping with existing matching or creating a new matching). Note that solve tries to maximize the number of matchings created for tiebreakers, which is necessary to avoid reconstruction issues with Aliens trick (touched on in more detail in the Aliens trick resource I linked at the top).


Notice that the final solution looks awfully similar to a greedy algorithm. And indeed, we can apply a greedy interpretation to this: each a_i can be matched with some b_j where i ≤ j for cost a_i + b_j − c, and we want the total cost as negative as possible. Then the two types of negative cycles can be interpreted as “options” when introducing a new index i: either match a_i with some unmatched b_j, or swap a_i into some existing matching (a_k, b_l) in place of a_k. In essence, the flow interpretation serves as a proof for the correctness of the greedy.

Do we need Aliens trick?

Maybe not, but the solution certainly gets significantly messier without it. If we don’t apply the Aliens trick and instead analyze the network directly, we have an edge from t to s with (−∞, k).

For the types of negative cycles introduced upon adding a new index i, the two types described above still apply, but now s → a_i → b_j → t → b_k → a_l → s is also possible, because the t → s edge no longer has infinite capacity, so it may not be possible to choose s → a_i → b_j → t → s instead.

Furthermore, not only is s → a_i → b_j → t → b_k → a_l → s possible, but there are even more complicated types of cycles that could spawn now. Consider the following case:

5 3
9 1 10 10 4
6 7 8 2 9

After processing i=5,4,3, our matchings will be (10,8),(10,2),(4,9) for total cost of 43.

When we introduce i=2, we can no longer take any more matchings as k=3, so we will have to consider swapping with existing matchings instead. The most negative cycle available replaces (10,8) with (1,7), for cost 1+7−10−8 = −10. This is a negative cycle of the third type that didn’t exist with the Aliens trick, namely s → a_2 → b_2 → t → b_3 → a_3 → s. And so now we have (1,7),(10,2),(4,9) with total cost 33.

Now introduce i=1. The most negative cycle is s → a_1 → b_4 → a_4 → s with cost 9−10 = −1. So after swapping, we have matchings (9,2),(1,7),(4,9). However, we are not done. There’s still a long negative cycle in this situation: t → b_2 → a_2 → b_4 → a_1 → b_1 → t with cost −7+6 = −1. What this does is match the 9 with the 6 instead, and then match the 1 with the 2 instead of the 7. So this converges to (9,6),(1,2),(4,9) with the minimum cost of 31.

If we were to do this in one negative cycle after adding i=1, it would be s → a_1 → b_1 → t → b_2 → a_2 → b_4 → a_4 → s with cost 9+6−7−10 = −2. That’s absurd!

In short, when there is a capacity k restriction, it is no longer the case that the optimal solution containing i from 2 to n changes just a little bit when adding i=1. It could completely morph.


So with this first example, we can see two different solutions that arise from thinking about flows. The first one is probably impossible to think of without flows. For the second one, it might be difficult to prove that the Aliens trick applies without formulating the problem as a flow. Once the Aliens trick is applied, the greedy is relatively easy to come up with without thinking about flows, but having the flow network still gives us confidence in its correctness.

Codeforces 866D: Buy Low Sell High

Statement Summary: You have knowledge of the price of a stock over the next n days. On the ith day it will have price p_i. On each day you can either buy one share of the stock, sell one existing share you currently have, or do nothing. You cannot sell shares you do not have. What is the maximum possible profit you can earn?

Constraints: 2 ≤ n ≤ 3·10^5, 1 ≤ p_i ≤ 10^6


This problem has several interpretations that lead to the same code. You can think of it as slope trick. Or as greedy. Let’s think about it as flows.

Firstly, we can loosen the restrictions by allowing for both buying and selling a stock on a given day. This won’t change the answer since both buying and selling on the same day is equivalent to doing nothing.

Now we design the following flow network with n+2 nodes: a source s and sink t and n nodes representing days 1 through n. We add the following edges:

  • s to i for all 1 ≤ i ≤ n with (−p_i, 1)
  • i to t for all 1 ≤ i ≤ n with (p_i, 1)
  • i to i+1 for all 1 ≤ i < n with (0, ∞)

The answer is thus the max cost flow of this network.

Image 5

Example for n=3

Now consider the cycle cancelling method. Add an edge from t to s with (0, ∞). We are using a cost of 0 instead of ∞ because we do not require the max amount of flow, and forcing the max flow is not always optimal.

This network is extremely similar to the previous problem. We add the nodes from n to 1. When we add node i, the possible positive (since we are doing max cost flow) cycles are:

  • s → i → ⋯ → j → t → s for i ≤ j with cost p_j − p_i
  • s → i → ⋯ → j → s for i ≤ j, also with cost p_j − p_i (here the last step is the reverse of the s → j edge, i.e. we cancel the purchase on day j and buy on day i instead)

So the code ends up being quite simple and may resemble some other greedy code you’ve seen.

Code
long long ret = 0;
priority_queue<int> pq;
for (int i=n-1; i>=0; i--) {
    pq.push(p[i]);
    if (pq.top() > p[i]) {
        ret += pq.top() - p[i];
        pq.pop();
        pq.push(p[i]);
    }
}

CSES Programmers and Artists

Statement Summary: You have n people; the ith person has x_i amount of programming skill and y_i amount of art skill. Each person can be assigned to be a programmer, an artist, or neither. You need exactly a programmers and b artists. What is the maximum sum of skills attainable by some assignment?

Constraints: 1 ≤ n ≤ 2·10^5, a + b ≤ n, 1 ≤ x_i, y_i ≤ 10^9


You get the drill now. Let’s model this problem as a flow network. Here is one possible construction using n+4 nodes.

  • s to i with (0, 1)
  • i to prog with (x_i, 1)
  • i to art with (y_i, 1)
  • prog to t with (0, a)
  • art to t with (0, b)

The answer is the max cost max flow of this network. Notice that in this case the maximum flow is also optimal as all augmenting paths will have positive weight.

Image 6

Example for n=3

Let’s try approaching this with successive shortest path! Essentially, we repeatedly find an augmenting path from s to t with maximum cost and send 1 unit of flow through. After some pondering, we conclude we only have 4 types of paths to consider.

  • s → i → prog → t with cost x_i
  • s → i → art → t with cost y_i
  • s → i → prog → j → art → t with cost x_i − x_j + y_j
  • s → i → art → j → prog → t with cost y_i − y_j + x_j

The first and fourth types of paths decrease the capacity of prog → t by 1, and the second and third types of paths decrease the capacity of art → t by 1.

Do we need to consider any more complicated paths, such as s → i → prog → j → art → k → prog → t? Turns out we don’t. Suppose the path s → i → prog → j → art → k → prog → t has the maximum cost, which is x_i − x_j + y_j − y_k + x_k. Now consider a different augmenting path that would also be valid: s → i → prog → t with cost x_i. The fact that the longer path was chosen implies x_i − x_j + y_j − y_k + x_k > x_i, or y_j − x_j + x_k − y_k > 0. This implies that there already existed a positive cycle j → art → k → prog → j before introducing our new augmenting path, contradicting the property of successive shortest path that the flow remains optimal (and hence free of positive cycles) after each augmenting path.

Therefore, we just need to maintain some heaps to repeatedly find the maximum cost path out of those 4 options:

  • pair unmatched i to programmer
  • pair unmatched i to artist
  • pair unmatched i to programmer, and switch some existing programmer j to artist
  • pair unmatched i to artist, and switch some existing artist j to programmer

My code is a bit verbose but attempts to implement this idea as straightforwardly as possible:

Code
#include <bits/stdc++.h>
using namespace std;

#ifdef LOCAL
#define DEBUG(...) debug(#__VA_ARGS__, __VA_ARGS__)
#else
#define DEBUG(...) 6
#endif

template<typename T, typename S> ostream& operator << (ostream &os, const pair<T, S> &p) {return os << "(" << p.first << ", " << p.second << ")";}
template<typename C, typename T = decay<decltype(*begin(declval<C>()))>, typename enable_if<!is_same<C, string>::value>::type* = nullptr>
ostream& operator << (ostream &os, const C &c) {bool f = true; os << "["; for (const auto &x : c) {if (!f) os << ", "; f = false; os << x;} return os << "]";}
template<typename T> void debug(string s, T x) {cerr << "\033[1;35m" << s << "\033[0;32m = \033[33m" << x << "\033[0m\n";}
template<typename T, typename... Args> void debug(string s, T x, Args... args) {for (int i=0, b=0; i<(int)s.size(); i++) if (s[i] == '(' || s[i] == '{') b++; else
if (s[i] == ')' || s[i] == '}') b--; else if (s[i] == ',' && b == 0) {cerr << "\033[1;35m" << s.substr(0, i) << "\033[0;32m = \033[33m" << x << "\033[31m | "; debug(s.substr(s.find_first_not_of(' ', i + 1)), args...); break;}}

template<typename T>
struct FastSet {
    priority_queue<T> pq, pending;

    void add(T x) {
        pq.push(x);
    }

    void rem(T x) {
        pending.push(x);
        while (!pq.empty() && !pending.empty() && pq.top() == pending.top()) {
            pq.pop();
            pending.pop();
        }
    }

    T max() {
        assert(!pq.empty());
        return pq.top();
    }
};

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int a, b, n;
    cin >> a >> b >> n;
    vector<int> x(n), y(n);
    for (int i=0; i<n; i++)
        cin >> x[i] >> y[i];

    FastSet<pair<int, int>> pqA, pqB;
    priority_queue<pair<int, int>> pqA2, pqB2;
    for (int i=0; i<n; i++) {
        pqA.add({x[i], i});
        pqB.add({y[i], i});
    }

    long long ret = 0;
    while (a > 0 || b > 0) {
        int mx = 0;
        if (a > 0) {
            if (!pqA.pq.empty())
                mx = max(mx, pqA.max().first);
            if (!pqB.pq.empty() && !pqB2.empty())
                mx = max(mx, pqB.max().first + pqB2.top().first);
        }
        if (b > 0) {
            if (!pqB.pq.empty())
                mx = max(mx, pqB.max().first);
            if (!pqA.pq.empty() && !pqA2.empty())
                mx = max(mx, pqA.max().first + pqA2.top().first);
        }
        assert(mx > 0);

        ret += mx;
        if (a > 0) {
            if (!pqA.pq.empty() && mx == pqA.max().first) {
                int i = pqA.max().second;
                pqA.rem({x[i], i});
                pqB.rem({y[i], i});
                pqA2.emplace(y[i] - x[i], i);
                a--;
                continue;
            }
            if (!pqB.pq.empty() && !pqB2.empty() && mx == pqB.max().first + pqB2.top().first) {
                int i = pqB.max().second, j = pqB2.top().second;
                pqA.rem({x[i], i});
                pqB.rem({y[i], i});
                pqB2.pop();
                pqB2.emplace(x[i] - y[i], i);
                pqA2.emplace(y[j] - x[j], j);
                a--;
                continue;
            }
        }
        if (b > 0) {
            if (!pqB.pq.empty() && mx == pqB.max().first) {
                int i = pqB.max().second;
                pqA.rem({x[i], i});
                pqB.rem({y[i], i});
                pqB2.emplace(x[i] - y[i], i);
                b--;
                continue;
            }
            if (!pqA.pq.empty() && !pqA2.empty() && mx == pqA.max().first + pqA2.top().first) {
                int i = pqA.max().second, j = pqA2.top().second;
                pqA.rem({x[i], i});
                pqB.rem({y[i], i});
                pqA2.pop();
                pqA2.emplace(y[i] - x[i], i);
                pqB2.emplace(x[j] - y[j], j);
                b--;
                continue;
            }
        }
        assert(false);
    }

    cout << ret << "\n";

    return 0;
}

More Problems