(rev. 2022/02/07)
Notes On The Stable Matching Problem
Discussion of the Problem and the Gale-Shapley Algorithm
As our text says: "The Stable Matching problem originated, in part, in 1962 when David Gale and Lloyd Shapley, two mathematical economists, asked the question: Could one design a college admissions process, or a job-recruiting process, that was self-enforcing? ... Gale and Shapley proceeded to develop a striking algorithmic solution to this problem ... . ... this is not the only origin of the Stable Matching Problem. It turns out that for a decade before the work of Gale and Shapley, unbeknownst to them, the National Resident Matching Program had been using a very similar procedure, with the same underlying motivation, to match residents to hospitals. Indeed, this system, with relatively little change, is still in use today."
Looking at the Gale-Shapley algorithm, we can see that it seems to "try" to help members of both groups get desirable matches. A "man" proposes first to his most-preferred "woman" and he continues to propose to "women" in order of decreasing preference, until he becomes engaged to his final match. Also, a "woman" may be able to "trade up," by exchanging one match for a different, more preferred, match.
It is important, however, to understand that there is no guarantee that all the participants will be "equally satisfied" with the matching that the Gale-Shapley algorithm creates, and people who study the Stable Matching Problem say that matchings produced by the Gale-Shapley algorithm can be "unfair."
What is guaranteed about the Gale-Shapley algorithm is that it creates a stable matching of all the "men" and "women." That means it matches up all the "men" and "women" monogamously in such a way that there are no unstable pairs. The definition of an unstable pair is that it is a pair (m, w) consisting of a "man" m and a "woman" w who are NOT matched to each other, but who prefer each other to their matches. We can picture an unstable pair (m,w) as looking like this:
      w'
     /
    /
   m <--♥--> w
            /
           /
          m'
The meaning of the diagram is that m and w mutually "love" each other more than their current matches. In other words, m prefers w to his match, w', AND w prefers m to her match, m'.
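The definition of an unstable pair translates directly into a check that can be run on any perfect matching. What follows is only a minimal sketch in Python. The function name unstable_pairs, the dictionary layout of the preference lists, and the tiny two-couple example are assumptions made for illustration, not something taken from our text.

```python
def unstable_pairs(men_prefs, women_prefs, match):
    """Return every unstable pair (m, w) in a perfect matching.

    men_prefs[m]   : list of women, most preferred first (assumed layout)
    women_prefs[w] : list of men, most preferred first (assumed layout)
    match[m]       : the woman matched to man m
    """
    # partner[w] is the man matched to woman w (the inverse of match).
    partner = {w: m for m, w in match.items()}
    # rank[w][m] is m's position on w's list; smaller means more preferred.
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}

    found = []
    for m, prefs in men_prefs.items():
        # Only women whom m ranks above his own match can be unstable with him.
        for w in prefs[:prefs.index(match[m])]:
            # (m, w) is unstable if w also ranks m above her own match.
            if rank[w][m] < rank[w][partner[w]]:
                found.append((m, w))
    return found


# The situation in the diagram: m is matched to w', and w is matched to m'.
men_prefs = {"m": ["w", "w'"], "m'": ["w", "w'"]}
women_prefs = {"w": ["m", "m'"], "w'": ["m", "m'"]}

print(unstable_pairs(men_prefs, women_prefs, {"m": "w'", "m'": "w"}))
# prints [('m', 'w')]
print(unstable_pairs(men_prefs, women_prefs, {"m": "w", "m'": "w'"}))
# prints []
```

A matching is stable exactly when this function returns an empty list.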
Now that we've had this discussion of the Stable Matching Problem, in the rest of this document, we're going to prove that the Gale-Shapley algorithm works. In other words, given a set of N women, N men, and their preference lists, the Gale-Shapley algorithm will find, without fail, a stable matching of the women and men.
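Before starting the proof, it may help to see one way the algorithm could be written as a program. The Python below is a sketch under assumptions, not the pseudocode from our text: the name gale_shapley, the dictionaries of preference lists, and the use of a queue to pick the next free man are all choices made here for illustration.

```python
from collections import deque


def gale_shapley(men_prefs, women_prefs):
    """Return a dict that maps each man to his final match.

    men_prefs[m] and women_prefs[w] are preference lists,
    most preferred first (an assumed input layout).
    """
    # rank[w][m] is m's position on w's list; smaller means more preferred.
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman m asks
    fiance = {}                              # fiance[w] = man engaged to w
    free_men = deque(men_prefs)              # every man starts out free

    # Loop while there is a free man. The proof in these notes shows that
    # a free man always has a woman left on his list to propose to.
    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_choice[m]]     # best woman m has not yet asked
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m                    # w was free, so she accepts
        elif rank[w][m] < rank[w][fiance[w]]:
            free_men.append(fiance[w])       # w trades up; old fiance is free
            fiance[w] = m
        else:
            free_men.append(m)               # w rejects m; he stays free
    return {m: w for w, m in fiance.items()}


men_prefs = {"Bill": ["Alice", "Charlotte"], "David": ["Alice", "Charlotte"]}
women_prefs = {"Alice": ["Bill", "David"], "Charlotte": ["Bill", "David"]}
print(gale_shapley(men_prefs, women_prefs))
# prints {'Bill': 'Alice', 'David': 'Charlotte'}
```

In this small example David proposes to Alice first, is rejected because she is already engaged to Bill, whom she prefers, and then becomes engaged to Charlotte.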
Proof That The Gale-Shapley Algorithm Works
- 1. During the execution of the Gale-Shapley algorithm,
"once a woman has become engaged, she remains engaged."
This is true because the way the algorithm works, a woman starts out
free, becomes engaged to the first man who proposes to her, and thereafter
may get a new fiancé by "trading up," but never goes back to
being free.
(A technicality: If you think about how a program that implements the
algorithm might work, it may occur to you that when the program
implements a "trade up" and replaces the old fiancé with
the new fiancé, there could be a brief instant when the data
representing the woman doesn't show her engaged to anyone,
and so maybe in that instant she is free. We don't
need to be concerned with that. To do the proof here, all we
need to know is that after a woman is first engaged,
every time execution reaches the end of an iteration of
the algorithm loop, that woman is still engaged. That's
certainly true.)
- 2. If a man m has proposed to all the women, then all the women are engaged.
[Proof: When m proposed to a woman w, she was either already engaged or she was free
and accepted the proposal. Either way, she was engaged after m proposed to her,
and, by (1), she remained engaged.]
- 3. Claim: If all N women are engaged, then all N men are engaged.
[Proof: the way the algorithm works, 1) a man has to PROPOSE to a woman
to become engaged, 2) he has to be FREE before he can propose, and
3) he can only propose to one woman at a time. Therefore it is impossible
for a man to be engaged to more than one woman at a time.
Therefore, when all N women are engaged, no two of them are engaged
to the same man, and so, the N women are engaged to N different men.
Because there are only N men, the N different men that the women are engaged to
are all the men. This proves the claim:
If all the women are engaged, then all the men are engaged.]
- 4. Claim: If all N men are engaged, then all N women are engaged.
[Proof: The way the algorithm works, a woman can only be engaged
to one man at a time. That's clear because the algorithm provides only
two ways for her to form an engagement. In the first case, one man may
propose to a woman when she is free, and she becomes engaged, just to that
one man. In the other case, she is already engaged and she trades up. In
that case she is engaged to just one man both before and after the trade.
Now that we have established that a woman can only be engaged to
one man at a time, we can use logic exactly parallel to what
we used to prove (3): when all the men are engaged,
they are engaged to N different women, and the N different women
are all the women, because there are only N women.
That proves the claim.]
- 5. Claim: If a man has proposed to all the women, then he is engaged.
[Proof: By (2), if a man, m, has proposed to all the women,
the women are all engaged. By (3), we can then conclude that all the men, including m,
are engaged.]
- 6. Claim: If a man is free, he has not proposed to all the women.
[Proof: This is logically equivalent to (5). A free man CAN'T have proposed to all
the women. He wouldn't be free if he had proposed to all the women.]
- 7. Claim: The loop condition of the Gale-Shapley algorithm
A) "There is a free man who has not proposed to all the women"
is logically equivalent to
B) "There is a free man"
[Proof: Obviously A logically implies B. If there's a free man who has
not proposed to all the women, then clearly there's a free man. On the other hand,
B logically implies A because if there's a free man, by (6), he has not proposed
to all the women.]
- 8. Claim: The stopping condition of the Gale-Shapley algorithm
is logically equivalent to
"Every man is engaged."
[Proof: The stopping condition is just the logical negation of the loop condition,
which by (7) is equivalent to "There is a free man," and
"Every man is engaged" is clearly the logical negation of "There is a free man."]
- 9. In each iteration of the loop, there is a proposal of a man to a woman.
When a man proposes, it is always to a woman to whom he has not proposed before,
so each proposal is new, never a repeat, never the same man
proposing to the same woman. The total number
of different possible proposals is N^2, which is the number
of ordered pairs (m, w) where m is one of the men, and w is one of the women.
- 10. If a time comes when the algorithm loop has iterated N^2 times,
then every possible proposal will have been made, and so
every man will have proposed to every woman. By (5),
that means that every man will be engaged, which is the condition that
stops the loop.
This shows that, no matter what the input is to the algorithm, it will stop after no
more than N^2 iterations of its main loop.
(Incidentally, it has been proved that the loop always stops after
at most N^2 - N + 1 iterations, which is fewer than N^2 when N > 1,
although the number can be quite close to N^2.)
Now it's clear that the loop never becomes an infinite loop,
and moreover we now know something about how
efficient the algorithm
may be: it performs O(N^2) loop iterations.
- 11. Claim: When the Gale-Shapley algorithm stops, all the men and women
are engaged.
[Proof: The stopping condition is "Every man is engaged" and we know
from (4) that all the women are engaged if all the men are engaged.
That proves the claim.]
Corollary to the Claim: The Gale-Shapley algorithm always creates
a perfect matching of the men to the women.
[Proof: This follows directly from the definition of a perfect matching.
As we've already observed, we can tell from reading the algorithm
that the Gale-Shapley algorithm always creates a monogamous matching
of some of the men to some of the women. Just above, we proved
that it matches all the men to all the women.
Therefore it is a perfect matching - it monogamously
matches all the men to all the women.]
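Claims 9 through 11 and the corollary can also be checked by experiment. The sketch below is an illustration only, not part of the proof: it counts loop iterations on randomly generated inputs and confirms that the count never exceeds N^2 and that the result is a perfect matching. The names count_proposals and random_instance, the random seed, and the input sizes are assumptions chosen for the example.

```python
import random


def count_proposals(men_prefs, women_prefs):
    """Run the proposal loop; return (match, number of loop iterations)."""
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman m asks
    fiance = {}                              # fiance[w] = man engaged to w
    free_men = list(men_prefs)
    iterations = 0
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]     # always a new proposal (claim 9)
        next_choice[m] += 1
        iterations += 1
        current = fiance.get(w)              # None if w is still free
        if current is None or rank[w][m] < rank[w][current]:
            fiance[w] = m                    # w accepts or trades up
            if current is not None:
                free_men.append(current)     # her old fiance becomes free
        else:
            free_men.append(m)               # w rejects m
    return {m: w for w, m in fiance.items()}, iterations


def random_instance(n, rng):
    """Build random preference lists for n men and n women."""
    men = [f"m{i}" for i in range(n)]
    women = [f"w{i}" for i in range(n)]
    men_prefs = {m: rng.sample(women, n) for m in men}
    women_prefs = {w: rng.sample(men, n) for w in women}
    return men_prefs, women_prefs


rng = random.Random(0)
for n in range(1, 8):
    for _ in range(200):
        men_prefs, women_prefs = random_instance(n, rng)
        match, iterations = count_proposals(men_prefs, women_prefs)
        assert iterations <= n * n            # the bound from claim 10
        assert len(match) == n                # every man is engaged (claim 11)
        assert len(set(match.values())) == n  # to n different women (corollary)
```

Passing these checks on random inputs is evidence, not proof. The numbered argument above is what establishes the bound for every input.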
The next thing we want to find out is whether this perfect matching
that we get from the Gale-Shapley algorithm is guaranteed to be
stable. Consider any pair of the form (m, w) where
m is one of the men, w is one of the women, and m and w are NOT matched
to each other. Just to make it easier to discuss them, let's call the man
and woman David and Alice. Further, let's assign names to the matches that
were assigned to David and Alice:
- 12. Alice is matched to Bill, and
- 13. Charlotte is matched to David.
Is it possible that (David, Alice) is an unstable pair? In other words,
is it possible that both of the following things are true?
- 14. David likes Alice better than his match, Charlotte.
- 15. Alice likes David better than her match, Bill.
If so, then Alice and David both have an incentive to 'run off'
with each other. This would be an instability, an unstable pair
of the matching.
- 16. Well, suppose that 12, 13, and 14 are true. Consider the following
logical consequences of that.
- 17. Because 14 is true (David likes Alice better than Charlotte),
Alice is higher on David's list than Charlotte,
and so David must have proposed to Alice before he proposed
to Charlotte.
- 18. By 12, Alice is not with David now, and because of
the way the algorithm works, she must have rejected David. Either she
turned David down when he proposed to her (because she was already
engaged to someone higher on her list), or she was once
engaged to David, and then dumped him to "trade up."
- 19. Either way, after Alice rejected David, she was engaged to
someone, Mr. X, whom she liked more than David. After that,
she never "traded down." The only possibility is that she
did some more "trading up."
- 20. Because of the reasoning in step 19, above, we can conclude that
Bill, the guy with whom Alice ended up, is someone she likes more than
David.
- 21. To sum up the logic, if 12, 13, and 14 are true, then
"Alice does not like David more than Bill."
In other words, if 12, 13, and 14 are true, then 15 is false.
- By showing that "if 12, 13, and 14 are true, then 15 is false,"
we have shown that 12, 13, 14, and 15 can't all be true.
This proves that the algorithm
creates a matching with no instabilities.
- So, we've reached an important milestone! We have now proved
that the Gale-Shapley algorithm really works. No matter
what the inputs are, the algorithm always outputs a stable matching,
which, by definition, is a perfect matching
with no unstable pairs.
- Our text goes on to define what it means for a man and woman
(m, w) to be valid partners. It means there exists
a stable matching in which m and w are matched to each other.
It is important to understand that there may be stable matchings
that cannot be created by the Gale-Shapley algorithm.
Keep that in mind when you think about
valid partners. A man and woman (m,w) are valid partners if there is
any stable matching that matches those two with each other.
- After defining valid partners, the text proves that
22. The Gale-Shapley algorithm always matches each man with his
"best valid partner"
- the valid partner who is highest on his preference list.
- Make sure you understand the idea of a "best valid partner."
It definitely does not mean "the first choice."
Given a set of n men and n women, and all their preference lists,
there are many different ways to make a perfect matching. We know
that at least one of those perfect matchings is stable
(the output of the Gale-Shapley
algorithm). There could be more than one stable matching for
those men and women and their preferences. How many stable matchings
there are depends on exactly what the preferences are.
It's possible that a given pair (m,w) are not
matched in any of the stable matchings that may exist for the 2n people
and their preferences.
In that case,
we say that m and w are not valid partners. For every man m, there is
at least one valid partner (the one he gets
from the Gale-Shapley algorithm). Maybe there are more valid partners of
m that he gets in other stable matchings. The best valid
partner of m is whichever of his valid partners is highest on his
list. That best valid partner could be the actual first woman
on m's list, but she could be a woman further down the list,
because the first woman on the list may not be a valid partner
for m. (Maybe there's no stable
matching in which m is matched up with the first woman on his list.)
It's even possible that
m's best valid partner is the last woman on his list! Depending
on what exactly the preferences are, it could work out that way!
- Anyway, it seems pretty amazing that the Gale-Shapley algorithm
simultaneously gives all the men the "best" match
they could hope to receive in any stable matching.
Also this fact shows that the algorithm always gives the same
matching as its output, no matter what particular men are chosen
during the execution to make the next proposal! No matter what,
the algorithm will match each guy to his one and only
best valid partner!
The central idea of the proof is to demonstrate that when the algorithm runs
no man is rejected by a valid partner.
- The text also proves that
23. The G-S algorithm pairs each woman with her worst
valid partner.
- I made proofs of 22 and 23 that are somewhat different from the proofs
in our text. Those proofs can be found in the same list of links where
the document you are reading is found. If you want to read the proofs, you can
click where it says: "Proof that the Gale-Shapley Algorithm Assigns
Best Valid Partners to Worst Valid Partners (HTML)."
At the time of this writing, the URL of the document is:
https://www.cs.csustan.edu/~john/Classes/CS4440/Notes/01A_StableMatch/BVP_pf.html
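For very small inputs, claims 22 and 23 can be checked by brute force: list every perfect matching, keep the stable ones, and compare each person's valid partners with the output of the algorithm. The Python below is a sketch for illustration only, and it is practical only for tiny N because it tries all N! perfect matchings. The function names, the input layout, and the two example instances are assumptions made here, not something taken from our text or from the linked proofs.

```python
from itertools import permutations


def gale_shapley(men_prefs, women_prefs):
    """Men propose in list order; return a dict man -> final match."""
    next_choice = {m: 0 for m in men_prefs}
    fiance = {}                                   # fiance[w] = man engaged to w
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = fiance.get(w)
        if current is None:
            fiance[w] = m                         # w was free
        elif women_prefs[w].index(m) < women_prefs[w].index(current):
            fiance[w] = m                         # w trades up
            free_men.append(current)
        else:
            free_men.append(m)                    # w rejects m
    return {m: w for w, m in fiance.items()}


def is_stable(men_prefs, women_prefs, match):
    """True if the perfect matching has no unstable pair."""
    partner = {w: m for m, w in match.items()}
    for m, prefs in men_prefs.items():
        for w in prefs[:prefs.index(match[m])]:   # women m prefers to his match
            if women_prefs[w].index(m) < women_prefs[w].index(partner[w]):
                return False
    return True


def all_stable_matchings(men_prefs, women_prefs):
    """Brute force over every perfect matching."""
    men = list(men_prefs)
    for perm in permutations(list(women_prefs)):
        match = dict(zip(men, perm))
        if is_stable(men_prefs, women_prefs, match):
            yield match


def check_claims_22_and_23(men_prefs, women_prefs):
    """Assert claims 22 and 23 on one input; return the stable matchings."""
    stable = list(all_stable_matchings(men_prefs, women_prefs))
    gs = gale_shapley(men_prefs, women_prefs)
    for m in men_prefs:
        valid = {s[m] for s in stable}            # m's valid partners
        assert gs[m] == min(valid, key=men_prefs[m].index)     # best (22)
    for w in women_prefs:
        valid = {m for s in stable for m in s if s[m] == w}
        gs_partner = next(m for m in gs if gs[m] == w)
        assert gs_partner == max(valid, key=women_prefs[w].index)  # worst (23)
    return stable


# An input with two stable matchings.
two = ({"m1": ["w1", "w2"], "m2": ["w2", "w1"]},
       {"w1": ["m2", "m1"], "w2": ["m1", "m2"]})
# An input with one stable matching, where David's best valid partner
# is the last woman on his list.
one = ({"Bill": ["Alice", "Charlotte"], "David": ["Alice", "Charlotte"]},
       {"Alice": ["Bill", "David"], "Charlotte": ["Bill", "David"]})

print(len(check_claims_22_and_23(*two)))   # prints 2
print(len(check_claims_22_and_23(*one)))   # prints 1
```

The second example illustrates the point made above about best valid partners: David's first choice is Alice, but no stable matching pairs them, so his best valid partner is Charlotte.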