2015-05-02
Kandori and Matsushima (1998)
Original article (link) posted: 12/12/2005
The following is a short version of "An Essay on Kandori and Matsushima (1998), 'Private Observation, Communication and Collusion' (Econometrica, 66)," my term paper for Professor Dutta's class.
Summary
The present paper analyzes the role of communication and the possibility of cooperation in a long-term relationship when the players' actions are imperfectly observed and each player receives only a private signal.
The analysis of such a situation, a repeated game with (imperfect) private monitoring, is known to be a hard problem in game theory because of its fairly complex mathematical structure, in particular the lack of common information shared by the players. Under private monitoring, the distribution of private histories is no longer common knowledge after a deviation (off the equilibrium path), because only the deviator takes her deviation into account when updating her beliefs, while the other players cannot. This means that the continuation play off the equilibrium path is not even an equilibrium of the original game. Therefore, the recursive structure found in the public monitoring case, i.e., the property that the continuation payoff after any history is drawn from the same set of equilibrium payoffs, is destroyed under private monitoring (hence we cannot apply the dynamic programming techniques provided by Abreu, Pearce and Stacchetti (1990)).
Indeed, in sharp contrast to the well-explored case of repeated games under public information (with the celebrated Folk Theorems by Fudenberg, Levine and Maskin (1994)), little had been known about the private monitoring case until recently. This is unfortunate because this class of games admits a wide range of applications such as collusion under secret price-cutting, exchange of goods with uncertain quality, and observation errors.
In this paper, the authors introduce communication into the model with private monitoring to overcome the basic difficulty of this subject. Namely, they assume that at the end of each period players can communicate what they privately observed. The announced messages generate a publicly observable history, and the players can play different equilibria depending on the history of communication. Using communication as a coordination device, the authors construct equilibria in which players reveal their private information truthfully, and show that the Folk Theorem obtains under a set of mild assumptions.
Finally, it should be noted that their results provide theoretical support for the conventional wisdom that communication plays an important role in sustaining collusion.
Introducing Communication
To overcome the difficulties associated with private monitoring, the authors introduce communication which generates publicly observable history, and enables players to play different equilibria depending on the history of communication. Thanks to this communication, the recursive structure is recovered, and one can apply the results in previous literature, e.g., the characterization of equilibrium payoff sets provided by Abreu, Pearce and Stacchetti (1990) and Fudenberg and Levine (1994), or Folk Theorems given by Fudenberg, Levine and Maskin (1994).
Main Result
The Folk Theorem obtains given Conditions 1-3 listed below.
Condition 1
If player j has a perfectly undetectable deviation at the minimax point for player i, j has no incentive to take it.
Condition 2
If either player i or j (but not both) deviates with certain probabilities from a pure action profile which generates an extremal point, the other players can statistically detect it.
Condition 3
The players other than i and j can statistically discriminate player i's (possibly mixed) deviations from player j's at any pure action profile which generates an extremal point.
Remark 1
These conditions guarantee that every Pareto-efficient profile and each minimax strategy profile are enforceable, which is sufficient to establish the following Folk Theorem. Roughly speaking, Conditions 2 and 3 correspond to the pairwise identifiability condition, and Condition 1 plays the role of the individual full-rank condition in Fudenberg, Levine and Maskin (1994). The former is sufficient for the Nash-threat version of the Folk Theorem, while the latter, in addition to the former, is sufficient for the minimax version of the Folk Theorem.
Folk Theorem
Suppose that there are more than two players (n>2) and the information structure satisfies Conditions 1-3. Also suppose that the dimension of V is equal to the number of players. Then any interior point in the set of feasible and individually rational payoffs can be achieved as a sequential equilibrium average payoff profile of the repeated game with communication, if the discount factor is close enough to 1.
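In symbols (my own paraphrase of the statement above, where $V$ is the feasible payoff set and $\underline{v}_i$ is player $i$'s minimax payoff):

$$\forall v \in \operatorname{int}\{v' \in V : v'_i \ge \underline{v}_i\ \forall i\},\ \exists\,\underline{\delta} < 1:\ \delta \in (\underline{\delta},1)\ \Rightarrow\ v \in E(\delta),$$

where $E(\delta)$ denotes the set of sequential equilibrium average payoffs of the repeated game with communication at discount factor $\delta$.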
Basic Idea
In their communication model, one must induce each player to reveal her signal truthfully. To do so, they consider equilibria in which each player's future payoff is independent of what she communicates. If this is the case, she is just indifferent as to what she says, and truthful revelation becomes a (weak) best response.
As one might expect, this can be done if there are at least three players and the information structure can distinguish different players' deviations. A player's private information can be used to determine when and how to transfer payoffs among the other players. (In the two-player case, such transfers are no longer available, and hence punishing the other player necessarily invites welfare loss.)
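A rough way to put this in symbols (my notation, not the paper's): let $w_i(m)$ be player $i$'s continuation payoff as a function of the message profile $m=(m_i,m_{-i})$. The construction requires

$$w_i(m_i, m_{-i}) = w_i(m_i', m_{-i}) \quad \text{for all } m_i,\, m_i',$$

so that player $i$'s own report cannot affect her continuation payoff, making truth-telling weakly optimal, while the reports of the others are used to shift continuation payoffs between $i$ and $j$ (lowering one and raising the other), which disciplines behavior without destroying total surplus.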
Remark 2
Roughly speaking, efficiency under publicly observable signals can be achieved if players can be punished by "transfers". If the information structure allows us to tell which player is the suspect, we can transfer the suspect player's future payoff to the other players. This provides the right incentives without causing welfare loss, in contrast to the case where all players are punished simultaneously.
Remark 3
The authors also examine the possibility of providing strict incentives to tell the truth. It is shown that when private signals are correlated, there is a way to check whether each player is telling the truth, and one can construct equilibria in which the players have strict incentives for truth-telling.
Remark 4
If there are two players, or if the information structure fails to distinguish different players' deviations, the above idea cannot be utilized. However, even in such cases, the Folk Theorem can be obtained by infrequent communication. This is based on the idea of Abreu, Milgrom and Pearce (1991) that delaying the release of information helps to achieve efficiency.
Remark 5
To analyze the equilibria, they employ the method developed by Fudenberg and Levine (1994). Instead of directly solving the repeated game, this method first considers simple contract problems associated with the stage game. Then, the solutions to those contract problems are utilized to construct the set of equilibrium payoffs of the repeated game.
Conclusion
As explained in the Summary, the characterization of equilibria in repeated games with private monitoring has been an open question, because these games lack a recursive structure and are hard to analyze. The present paper shows that communication is an important means to resolve possible confusion among players in the course of collusion during repeated play. Namely, by introducing communication to generate a publicly observable history, the authors recover the recursive structure and show a Folk Theorem.
However, it should be noted that they did not show the necessity of communication for a Folk Theorem. In principle, a Folk Theorem might hold even without communication. Indeed, the analysis of private monitoring has developed rapidly since Kandori and Matsushima (1998), although a complete characterization of equilibria has not yet been achieved. Therefore, I would like to mention the recent literature on repeated games with private monitoring, which concludes this essay (I relied on the excellent survey by Kandori (2002) for the remaining part).
Recent Literature
Sekiguchi (1997) is the first paper to construct an equilibrium that differs from the repetition of the stage-game equilibrium under private monitoring. He shows that efficiency can be approximately achieved (without communication) in the prisoner's dilemma model if the information is almost perfect.
Bhaskar and Obara (2002) extend Sekiguchi's construction to support any point Pareto-dominating (d,d) (in the prisoner's dilemma) when monitoring is private but almost perfect. The Sekiguchi-Bhaskar-Obara type of equilibrium is called a "belief-based" equilibrium because it relies on the coordinated punishment idea.
Piccione (2002) and Ely and Valimaki (2002) introduce a completely different, useful technique to support essentially the same set of payoffs under almost perfect monitoring. In contrast to the "belief-based" equilibria of S-B-O, their equilibria utilize the uncoordinated punishment idea, and hence are named "belief-free" equilibria.
Matsushima (2004) extends Ely and Valimaki's construction and shows that their Folk Theorem continues to hold even if monitoring is far from perfect, as long as private signals are distributed independently.
Ely, Horner and Olszewski (2005) give the most general results in two-player repeated games with private monitoring. Using "belief-free" strategies, they provide a simple and sharp characterization of equilibrium payoffs. While such strategies support a large set of payoffs, they are not rich enough to generate a Folk Theorem in most games besides the prisoner's dilemma, even when information is almost perfect.
References
Abreu, Milgrom and Pearce (1991) "Information and timing in repeated partnerships" Econometrica, 59
Abreu, Pearce and Stacchetti (1990) "Toward a Theory of Discounted Repeated Games with Imperfect Monitoring" Econometrica, 58
Bhaskar and Obara (2002) "Belief-based Equilibria in the Repeated Prisoners' Dilemma with Private Monitoring" Journal of Economic Theory, 102
Ely, Horner and Olszewski (2005) "Belief-free Equilibria in Repeated Games" Econometrica, 73
Ely and Valimaki (2002) "A Robust Folk Theorem for the Prisoner's Dilemma" Journal of Economic Theory, 102
Fudenberg and Levine (1994) "Efficiency and Observability with Long-Run and Short-Run Players" Journal of Economic Theory, 62
Fudenberg, Levine and Maskin (1994) "The Folk Theorem with Imperfect Public Information" Econometrica, 62
Kandori (2002) "Introduction to Repeated Games with Private Monitoring" Journal of Economic Theory, 102
Matsushima (2004) "Repeated Games with Private Monitoring: Two Players" Econometrica, 72
Piccione (2002) "The Repeated Prisoner's Dilemma with Imperfect Private Monitoring" Journal of Economic Theory, 102
Sekiguchi (1997) "Efficiency in the Prisoner's Dilemma with Private Monitoring" Journal of Economic Theory, 76
2014-05-20
Athey et al. (2005)
Original article (link) posted: 07/11/2005
Athey, Atkeson and Kehoe (2005) "The Optimal Degree of Discretion in Monetary Policy" Econometrica
The paper examines a monetary policy game in which the monetary authority has private information about the state of the economy. In the literature, two seminal papers, Taylor (1983) and Canzoneri (1985), established that no discretion should be left to the monetary authority if there is no such private information; the best outcomes can be achieved by rules that specify the action of the monetary authority as a function of observables. The introduction of private information creates a tension between discretion and time inconsistency. Tight constraints on discretion mitigate the time inconsistency problem, in which the monetary authority is tempted to claim repeatedly that the current state of the economy justifies a monetary stimulus to output. However, tight constraints leave little room for the monetary authority to fine-tune its policy to its private information. In short, loose constraints allow the monetary authority to do that fine tuning, but they also allow more room for the monetary authority to stimulate the economy with surprise inflation.
Making use of dynamic mechanism design techniques, they find that the optimal mechanism is quite simple; for a broad class of economies, the optimal mechanism is static (policies depend only on the current report by the monetary authority) and can be implemented by setting an inflation cap, an upper limit on the permitted inflation rate. It is also shown that the optimal degree of monetary policy discretion shrinks as the severity of the time inconsistency problem increases relative to the importance of private information.
Comment
The results they derive seem to be very interesting. In particular, the "inflation cap" result is directly related to optimal inflation targets and is highly relevant to actual monetary policy. As they mention, their work provides one theoretical rationale for the type of constrained discretion advocated by Bernanke and Mishkin (1997). This paper might become a must-read for those who work on monetary policy.
References
Bernanke and Mishkin (1997) "Inflation Targeting: A New Framework for Monetary Policy?" Journal of Economic Perspectives, 11
Canzoneri (1985) "Monetary Policy Games and the Role of Private Information" AER, 75
Taylor (1983) "Rules, Discretion, and Reputation in a Model of Monetary Policy: Comments" JME, 12
2011-06-27
Two Papers on Repeated Games
Original article (link) posted: 31/10/2005
Sorin (1986) "On Repeated Games with Complete Information" Mathematics of Operations Research, 11-1
Several properties of the set of feasible payoffs of repeated games are shown. In particular, a condition under which the set of feasible payoffs coincides with the convex hull of the pure-strategy payoffs is given: it is necessary and sufficient that the discount factor be at least 1-1/N, where N is the number of players.
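As a concrete illustration of the time-averaging idea (my own example, not from the paper): with $N=2$ and $\delta = 1-\tfrac{1}{N} = \tfrac{1}{2}$, playing a pure action profile $a^1$ in period 0 and another profile $a^2$ in every period thereafter yields the normalized discounted payoff

$$(1-\delta)\,u(a^1) + \sum_{t=1}^{\infty}(1-\delta)\delta^{t}\,u(a^2) = \tfrac{1}{2}u(a^1) + \tfrac{1}{2}u(a^2),$$

so the midpoint of two pure-strategy payoff vectors is hit exactly without any public randomization. For $\delta < \tfrac{1}{2}$ this midpoint is unattainable by any deterministic sequence: putting $a^1$ in period 0 already places weight $1-\delta > \tfrac{1}{2}$ on it, while excluding period 0 leaves total weight at most $\delta < \tfrac{1}{2}$.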
Dal-Bo (2001) “Tacit Collusion under Interest Rate Fluctuations” Job Market Paper, UCLA
The paper examines the optimal tacit collusion equilibrium when the discount factor changes over time. It is shown that collusive prices and profits depend not only on the level of the discount factor but also on its volatility; they increase with a higher discount factor level and decrease with its volatility. The model is a variant of the Rotemberg-Saloner model where, instead of demand, the discount factor is assumed to fluctuate.
2011-02-19
Lecture 7 (Dutta): Repeated Games 2
Original article (link) posted: 25/10/2005
We continued to examine the Abreu-Pearce-Stacchetti (APS) operator, particularly focusing on the following two theorems.
Theorem 1 (Necessity)
V* = LV*
Theorem 2 (Sufficiency)
If V = LV (and V is bounded), then V is a subset of V*
where L is the APS operator and V* is the set of SPE payoffs of the repeated game.
The proof of Theorem 1 is not difficult. We used "unimprovability" to prove Theorem 2. The APS operator also yields the following results:
1. V* is compact
2. V* is increasing in the discount factor
3. The APS operator is monotone
Using the third result together with the two theorems mentioned above, we can derive an algorithm to compute SPE payoffs. That is, starting with a large set of candidate equilibrium payoffs (say, the convex hull of the set of feasible payoffs), we just need to apply the APS operator iteratively until the sequence of sets converges. Then the limit must coincide with V*.
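Just to make this algorithm concrete for myself, here is a rough numerical sketch (my own toy example, not anything from the lecture): a two-player prisoner's dilemma with perfect monitoring, pure strategies, worst-point punishments, and a coarse grid of payoff points with rounding. A serious implementation would approximate the convex equilibrium payoff sets properly (e.g., with supporting hyperplanes) instead of rounding points.

```python
# Toy APS iteration for a repeated prisoner's dilemma with perfect monitoring.
# Payoffs, discount factor, grid and rounding are illustrative choices only.
import itertools
import numpy as np

U = {  # stage-game payoffs: action 0 = Cooperate, 1 = Defect
    (0, 0): (2.0, 2.0), (0, 1): (-1.0, 3.0),
    (1, 0): (3.0, -1.0), (1, 1): (0.0, 0.0),
}
ACTIONS = (0, 1)
DELTA = 0.8

def best_deviation(a, i):
    """Best stage payoff player i can get by deviating from profile a."""
    if i == 0:
        return max(U[(d, a[1])][0] for d in ACTIONS)
    return max(U[(a[0], d)][1] for d in ACTIONS)

def aps_step(W):
    """One application of the APS operator to a finite set of payoff points W."""
    worst = [min(w[i] for w in W) for i in range(2)]  # harshest continuation per player
    new = set()
    for a in itertools.product(ACTIONS, ACTIONS):
        for w in W:
            v = tuple((1 - DELTA) * U[a][i] + DELTA * w[i] for i in range(2))
            # Enforceability: deviating today and then getting the worst available
            # continuation must not be profitable for either player.
            if all(v[i] >= (1 - DELTA) * best_deviation(a, i) + DELTA * worst[i]
                   for i in range(2)):
                new.add(tuple(round(x, 1) for x in v))  # coarse rounding keeps the set finite
    return new

# Start from a grid covering the feasible payoff box and iterate the operator.
grid = np.linspace(-1.0, 3.0, 41)
W = {(float(x), float(y)) for x in grid for y in grid}
for _ in range(200):
    W_next = aps_step(W)
    if W_next == W:
        break
    W = W_next

print(len(W), "surviving payoff points")
print("best symmetric payoff:", max(v[0] for v in W if abs(v[0] - v[1]) < 1e-9))
```

With these numbers the usual grim-trigger condition holds (2 >= (1-0.8)*3), so the best symmetric surviving payoff should be the cooperative payoff 2.0.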
2011-02-05
Lecture 6 (Dutta): Repeated Games 1
Original article (link) posted: 21/10/2005
Repeated Games: Set-Up
We first checked the definitions of the following: a stage game, a repeated game, a subgame, strategies, histories, a Nash equilibrium, a subgame perfect NE, feasible payoffs, and individually rational payoffs.
Note) Any point in the convex hull of the pure-strategy payoffs is feasible when the discount factor is sufficiently large. (The proof uses time-averaging strategies; see Sorin (1986).)
Abreu-Pearce-Stacchetti Characterization
Then, we investigated the APS operator, which captures an idea similar to the Bellman operator in a single-agent dynamic optimization problem.
Since this blog is not designed for writing messy equations, I will not cover the mathematical argument about the APS operator here. You can check Chapter 5 of Fudenberg and Tirole (1991) ("Dynamic Programming and Self-Generation" in 5.5.4) or the original paper by APS (1990).
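Still, for my own reference, here is a rough statement of the key definition (my notation; see APS (1990) for the precise treatment). With a public signal $y$ drawn from $\pi(\cdot\mid a)$, a payoff vector $v$ is decomposable with respect to a set $W$ if there exist an action profile $a$ and a continuation-payoff function $w:Y\to W$ such that, for every player $i$ and every deviation $a_i'$,

$$v_i = (1-\delta)u_i(a) + \delta\sum_{y}\pi(y\mid a)\,w_i(y) \;\ge\; (1-\delta)u_i(a_i',a_{-i}) + \delta\sum_{y}\pi(y\mid a_i',a_{-i})\,w_i(y).$$

The operator $L$ maps $W$ to the set of payoffs decomposable with respect to $W$; a bounded set with $W\subseteq LW$ is called self-generating, and every self-generating set consists of equilibrium payoffs.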
References
Abreu, Pearce and Stacchetti (1990) "Toward a Theory of Discounted Repeated Games with Imperfect Monitoring" Econometrica, 58
Sorin (1986) "On Repeated Games with Complete Information" Math. of Operations Research, 11-1
2010-11-10
Kandori (1991)
Original article (link) posted: 01/10/2005
Kandori (1991) "Correlated Demand Shocks and Price Wars During Booms" RES, 58
The paper extends the analysis of Rotemberg and Saloner (1986) to the case of serially correlated demand shocks and derives the same counter-cyclical movement as in their (i.i.d.) case, provided the discount factor and the number of firms satisfy a certain relationship.
The key observation in Rotemberg and Saloner (1986) was that, if the sum of future profits is unaffected by today's demand, firms must set the price relatively low when demand is high. The premise is clearly satisfied when the demand shocks are i.i.d. This paper shows that introducing Markov demand shocks also creates the same situation in the following two cases. The first case is when the discount factor delta exceeds, but is close to, (N-1)/N, where N is the number of firms. It is shown that firms maintain a constant profit (which equals the monopoly profit in the worst state) under all demand conditions. Therefore, the extent of the correlation in demand is irrelevant.
The second case arises when delta tends to unity while (1-delta)N is held constant. In this case, firms are enormously forward-looking and total future profit is mostly determined by the stationary distribution, which is independent of today’s demand position.
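To fix ideas, in a Bertrand version of the Rotemberg-Saloner logic (my shorthand, not the notation of either paper), the no-deviation constraint in demand state $s$ with Nash-reversion punishment is roughly

$$\frac{N-1}{N}\,\Pi(p_s,s) \;\le\; \frac{\delta}{1-\delta}\,\mathbb{E}\!\left[\frac{\Pi(p_{s'},s')}{N}\right],$$

where the left side is today's gain from undercutting (capturing the rivals' market shares) and the right side is the discounted stream of collusive profit lost after reversion. When the right side does not depend on today's state, a high-demand state (large $\Pi$) forces the collusive price $p_s$ down so that the constraint still holds; Kandori's two cases are exactly circumstances under which the right side is (approximately) independent of the current state even with correlated shocks.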
The result itself is not that surprising (compared to the other papers by Kandori, at least). However, he is amazingly good at selling his work, especially in the following two points:
First, he stresses the importance of Rotemberg and Saloner (1986) as well as its drawbacks. The motivation for extending their paper becomes very clear, and the reader naturally gets interested in HIS work.
Second, his way of illustrating results is quite lucid and rigorous. Although the results can more or less be expected to hold, it is always difficult to prove them rigorously.
Those techniques should be useful for us. Let's learn them from Kandori's papers!
Interesting Papers that cite Kandori (1991)
Bagwell (2004) "Countercyclical Pricing in Customer Markets" Economica, 71
Bo (2001) "Tacit Collusion under Interest Rate Fluctuations" Job Market Paper
Harrington (2004) "Cartel Pricing Dynamics in the Presence of an Antitrust Authority" Rand
2010-09-13
Kandori (RES 1992)
Original article (link) posted: 26/09/2005
Kandori (1992) "Social Norms and Community Enforcement" RES, 59
The paper considers self-enforcing mechanisms in the situation where agents change their partners over time. Technically, the model in the paper is an extension of the theory of repeated games to the case of matching games. As main results, the following two are shown.
1) Community can sustain cooperation even when each agent knows nothing more than his personal experience.
2) Community can realize any mutually beneficial outcomes when each agent carries a label which is revised in a systematic way. That is, the Folk Theorem holds.
As a motivation of the research, he mentions the benefit of specialization. After introducing theoretical achievements in personal enforcement mechanisms, i.e. Folk Theorem in the repeated game literature, he says as follows.
However, many important transactions are infrequent in nature. As economic historians argue, the division of labor and specialization are important driving forces of economic progress. Potential gains are larger in diverse transactions with different specialists than with fixed partners. Therefore, the control of incentives in such an infrequent trade is of vital importance to understand the organization of economic transactions.
He refers to two papers which initiated this line of research.
The attempt to generalize the Folk Theorem of repeated games to the case of matching games was initiated by Milgrom, North and Weingast (1990) and Okuno-Fujiwara and Postlewaite (1989). The former analyzed concrete examples of information transmission mechanisms and the latter introduced the notion of local information processing. Both of them, however, mainly deal with the infinite population case to avoid potentially complicated problems of incentives on off-equilibrium paths. Our paper shows that such problems can be resolved in a simple way if the stage game satisfies weak condition. Equilibria constructed in our paper work for any population size and any matching rule, and are robust to changes in information structures.
What a strong result he derived!! Although he does not stress the results given in Section 3 "Folk Theorem under Public Observability", I think Proposition 2 is very interesting. It is easy to understand that the Folk Theorem holds if all the other players move into a punishment phase after some player deviates, which is stated as Proposition 1. However, if we restrict attention to the situation where only the deviator is to be punished and innocent pairs are to play the originally prescribed actions, showing the Folk Theorem is not straightforward. To be more precise, checking the incentives of innocent players on off-equilibrium paths where the community is highly populated with "guilty" agents is involved.
By introducing some "forgiveness" into the social norm, the author elegantly shows that this problem can be avoided, which leads to Proposition 2.
Interesting Papers in References
Harrington (1989) "Cooperation in Social Settings" mimeo
Section 7 of the above paper was revised and available as the following.
Harrington (1995) "Cooperation in a One-Shot Prisoners' Dilemma" GEB, 8
Milgrom, North and Weingast (1990) "The Role of Institutions in the Revival of Trade: The Law Merchant, Private Judges, and the Champagne Fairs" Economic Inquiry, 25
Okuno-Fujiwara and Postlewaite (1995) "Social Norms in Random Matching Game" GEB, 9
Rubinstein and Wolinsky (1990) "Decentralized Trading, Strategic Behavior and the Walrasian Outcome" RES, 57
2010-07-04
Microeconomic Analysis - G6218
Original article (link) posted: 13/09/2005
Today, I attended the first class of G6218, an advanced microeconomics course for upper-class students taught by Professor Dutta. The course covers basic issues in dynamic economics and consists of the following three parts.
1) Dynamic Programming: The theory of single-agent models
2) Repeated/Dynamic Games: The theory of multi-agent models
3) Application: A strategic analysis of climate change
Although there is no required textbook, the two books listed below seem to be useful.
Stokey, Lucas and Prescott (1989) "Recursive Methods in Economic Dynamics" for (1)
Fudenberg and Tirole (1991) "Game Theory" for (2)
Since Professor Dutta is very good at teaching and the topics are interesting to me and mathematically deep, I am thinking of taking this course. I had better bring SLP from Princeton... (I asked my friends in Princeton to keep my books.)
2010-06-23
Fudenberg, Levine and Maskin (Econometrica 1994)
Original article (link) posted: 08/09/2005
In their paper, "The Folk Theorem with Imperfect Public Information", several versions of the Folk Theorem are shown. This is a memo on them. Before stating the actual theorems, let's briefly check their contribution to the literature.
An important hypothesis of the standard Folk Theorem is that the players can observe one another's actions in each repetition, so that deviations from equilibrium strategies are detectable. In contrast, this paper considers games in which players observe only a public outcome that is a stochastic function of the actions played. Thus these are games of moral hazard. The major task of the paper is to provide conditions sufficient for the Folk Theorem to extend to such games. The most important hypotheses concern the way the probability distribution over public outcomes depends on the player's actions.
To see the Folk Theorem with perfect information, you should check Fudenberg and Maskin (Econometrica 1986) "The Folk Theorem for Repeated Games with Discounting and Incomplete Information". (Notice that "Incomplete Information" in the title refers to cases without a public randomization device or to some reputation models, and does not mean imperfect monitoring, which is what FLM cover.)
The three versions of Folk Theorem they showed are as follows:
Nash-threat 1
If all pure-action profiles are pairwise identifiable, a "Nash-threat" version of the Folk Theorem obtains: any payoff vector Pareto-dominating a Nash equilibrium of the stage game can be sustained in an equilibrium of the repeated game for discount factors near enough to 1.
Nash-threat 2
If a game has at least one (mixed-action) profile satisfying the conjunction of pairwise identifiability and individual full rank (= "pairwise full rank"), then again the Nash-threat Folk Theorem applies. Generic games possess such a profile provided that the number of possible public outcomes is no less than the total number of elements in the action sets of any two players.
Minimax-threat
To obtain the conventional "minimax-threat" Folk Theorem requires more stringent conditions. Specifically, besides the hypotheses for the Nash-threat theorem, it suffices to assume that all pure-action profiles satisfy individual full rank.
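For my own reference, here is a rough version of the full-rank conditions (my notation; see FLM (1994) for the exact definitions). With public outcomes $y$ distributed as $\pi(\cdot\mid a)$, a profile $\alpha$ has individual full rank for player $i$ if the vectors $\{\pi(\cdot\mid a_i,\alpha_{-i})\}_{a_i\in A_i}$ are linearly independent, so that player $i$'s own deviations can be statistically identified. Pairwise full rank for players $i$ and $j$ at $\alpha$ requires the combined family

$$\{\pi(\cdot\mid a_i,\alpha_{-i})\}_{a_i\in A_i}\ \cup\ \{\pi(\cdot\mid a_j,\alpha_{-j})\}_{a_j\in A_j}$$

to have rank $|A_i|+|A_j|-1$, which is what allows deviations by $i$ to be statistically distinguished from deviations by $j$.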
Finally, I quote their explanation of two examples of inefficiency results, by Radner-Myerson-Maskin and by Green-Porter (or Abreu-Pearce-Stacchetti).
Our work makes clear that the R-M-M counterexample relies on there being relatively few possible public outcomes compared to the number of possible actions, so that the genericity result mentioned before does not apply, and that equilibria of the G-P and A-P-S variety are necessarily inefficient only because they are symmetric: if (even slightly) asymmetric equilibria are admitted, the Folk Theorem is restored.
2010-06-15
Information in Repeated Games
Original article (link) posted: 31/08/2005
Let's think about the Green and Porter model.
What happens if demand becomes less stochastic, so that the firms can observe one another's output better?
Kandori (1992) "The Use of Information in Repeated Games with Imperfect Monitoring"(RES 59) answers this question in a general framework.
In the paper, using the concept called "quasi-garbling" which is Blackwell's definition of informativeness, Kandori elegantly shows that
in the general model of discounted repeated games with imperfect monitoring, the set of payoffs attainable via pure-strategy sequential equilibria becomes larger (in the sense of set inclusion) as the observability of the past actions increases. The intuition behind the assertion is that the more accurately "cheating" is detected, the easier it is to enforce coordination.
Thus, the answer to the first question is
in an oligopoly model of the Green-Porter type, the best symmetric equilibrium becomes better and the most severe symmetric punishment (the worst equilibrium) becomes more severe, as the demand becomes less noisy.
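For reference, the garbling notion behind this result (my notation): a monitoring structure $\pi'$ is less informative than $\pi$ in Blackwell's sense if there is a stochastic matrix $Q$, independent of the actions, such that

$$\pi'(y'\mid a) = \sum_{y} Q(y'\mid y)\,\pi(y\mid a) \quad \text{for all } y',\, a,$$

i.e., the signals under $\pi'$ can be generated by adding pure noise to the signals under $\pi$. Kandori's theorem then says that replacing $\pi$ with such a garbled $\pi'$ can only shrink (weakly) the set of pure-strategy sequential equilibrium payoffs.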
The papers written by Kandori are all clear and elegant! How could he write so many great papers... Every time I read one of his papers, I feel I am losing my confidence and realize how far he is ahead of where I am standing now :(
P.S.
Professor Kandori was my senior thesis advisor at University of Tokyo.
2010-06-12
A note on Abreu (1986)
Original article (link) posted: 20/08/2005
There has been no systematic attempt to study the maximal degree of collusion sustainable by credible threats for arbitrary values of the discount factor. In view of the motivation for moving from static to repeated models, this has some claims to being the essential question at issue.
As argued by Abreu (1983, Ph.D thesis), the fundamental determinant of the limits of collusion is the severity of punishments with which potential deviants from cooperative behavior can credibly be threatened. Accordingly, this paper concentrates principally on characterizing strategy profiles which yield optimal (in the sense of most severe) punishments.
A particular class of paths called two-phase punishments plays a central role. A two-phase punishment is symmetric; in addition it is stationary after the first period, i.e., in the second phase. I show that the optimal two-phase punishment is:
(1) Globally optimal for a certain range of parameter values.
(2) An optimal symmetric punishment.
(3) More severe than Cournot-Nash reversion.
(4) Easily calculated. (It is completely characterized by a pair of simultaneous equations.)
(5) The second phase of the optimal two-phase punishment is the most collusive symmetric output level which can be sustained by the optimal two-phase punishment itself.
...
(5) says that the optimal two-phase punishment consists of a stick and carrot; furthermore, the carrot phase is the most attractive collusive regime which can credibly be offered when optimal two-phase punishments are used to deter defections. Thus when we solve for optimal two-phase punishments, we simultaneously determine the maximal degree of collusion sustainable by optimal symmetric punishments. It is worth remarking that the stick-and-carrot property is not a curiosum, but arises naturally from the structure of the problem.
Optimal asymmetric punishments have a rather complicated structure and thus far elude description as complete as that provided for optimal symmetric punishments in the earlier section.
(All quoted from Abreu (1986))
I would like to say something about asymmetric cases below.
For delta large enough, optimal symmetric punishments yield a payoff of zero and hence are the most severe punishments (note that a firm's minmax payoff in the component game is zero). In this case, all firms simultaneously minmax one another in the first phase of the punishment.
However, if an optimal symmetric punishment yields the firms positive payoffs (more than their minmax payoffs), then it is not globally optimal.
Abreu gives characterizations of optimal asymmetric punishments, but they are complicated and much less sharp than those for the symmetric case.
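Coming back to the symmetric case, the "pair of simultaneous equations" in point (4) is easy to write down for a concrete example. The sketch below is my own illustrative linear Cournot duopoly, not Abreu's notation: it makes both one-shot deviation constraints bind, one for the carrot output played forever from the second period on and one for the stick output played in the first period of the punishment. The discount factor is chosen so that neither the monopoly corner nor the zero-payoff corner applies, which is the parameter range where this interior characterization is valid.

# A minimal numerical sketch of Abreu's stick-and-carrot (two-phase) punishment
# for a symmetric linear Cournot duopoly.  Parameters are my own illustrative
# choices, and the notation is not Abreu's.
import numpy as np
from scipy.optimize import fsolve

a, b, c, delta = 10.0, 1.0, 2.0, 0.2   # inverse demand P = a - b*Q, unit cost c

def pi(q_own, q_other):
    # one-period profit at the given quantity pair
    return q_own * (a - b * (q_own + q_other) - c)

def dev(q_other):
    # best one-shot deviation payoff against a rival producing q_other
    q_br = max(0.0, (a - c - b * q_other) / (2.0 * b))
    return pi(q_br, q_other)

def equations(x):
    q_carrot, q_stick = x
    # common right-hand side: delta * (V_carrot - V_punishment)
    slack = delta * (pi(q_carrot, q_carrot) - pi(q_stick, q_stick))
    return [dev(q_carrot) - pi(q_carrot, q_carrot) - slack,   # carrot-phase IC binds
            dev(q_stick) - pi(q_stick, q_stick) - slack]      # stick-phase IC binds

q_carrot, q_stick = fsolve(equations, x0=[2.0, 4.0])
V_carrot = pi(q_carrot, q_carrot) / (1 - delta)
V_punish = pi(q_stick, q_stick) + delta * V_carrot
print("carrot output per firm :", round(q_carrot, 3))
print("stick output per firm  :", round(q_stick, 3))
print("Cournot output per firm:", round((a - c) / (3 * b), 3))
print("punishment value       :", round(V_punish, 3))

The solution lands between the monopoly and Cournot outputs in the carrot phase and above Cournot in the stick phase, which is exactly the stick-and-carrot shape described in the abstract quoted above.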
2010-06-09
Two essential papers by APS
Original article (link) posted: 20/08/2005
What's the difference between APS (1986) and APS (1990)? The distinction is similar to that between Abreu (1986) and Abreu (1988). The earlier papers (both published in JET), APS (1986) and Abreu (1986), analyze optimal strategies in actual oligopoly models using the powerful properties of repeated games they establish. The later papers (both in Econometrica) extend the results to more general settings. Although the two Econometrica papers are more general and sophisticated than the JET ones, they have some drawbacks, such as a lack of motivation, explanations that are too concise to follow easily, and, more seriously, the absence of explicit optimal strategies. The two JET papers are strongly motivated by actual collusion problems in IO, so you can see how useful the properties they found are.
The two APS papers rely heavily on dynamic programming techniques. Using them, they reduce the repeated game to a static structure from which the optimal equilibria in question can be extracted. That is, they construct a new game by truncating the discounted supergame as follows:
after each first-period history, replace the sequential equilibrium successor by the payoffs associated with that successor.
APS (1990) summarize the differences between the two papers:
First, it relaxes the restriction of symmetry, showing the theory capable of embracing both asymmetric equilibria of symmetric games and arbitrary asymmetric games. Secondly, the sufficiency of using bang-bang reward functions in efficiently collusive equilibria is strengthened to a necessity theorem. Finally, we provide an algorithm useful in computing the sequential equilibrium value set.
(APS (1990), p.1044)
Finally, let me remind you of the difference between Abreu and APS. The papers are essentially different in that the former analyze perfect monitoring while the latter consider imperfect monitoring. Moreover, even though all of these papers are inspired by dynamic programming techniques, their focus differs somewhat: the argument in Abreu is about equilibrium paths, whereas that in APS is about equilibrium payoff sets.
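To make the truncation idea above concrete, here is a toy sketch (my own example, not taken from either paper) of the APS-style operator for strongly symmetric equilibria of a repeated prisoner's dilemma with perfect monitoring. Assuming a public randomization device, the candidate value set can be kept as an interval; a stage action is enforceable if the one-shot deviation gain is no larger than delta times the gap between the promised continuation value and the worst value in the candidate set.

# Toy illustration of the APS decomposition: iterate a set-valued operator on
# intervals to find the strongly symmetric SPE payoffs of a repeated prisoner's
# dilemma.  My own toy example with public randomization assumed, not APS's model.
# Stage game (symmetric payoffs): (C,C) -> 2, (D,D) -> 1, deviating from (C,C) gives 3.

delta = 0.6
stage = {"C": 2.0, "D": 1.0}            # symmetric stage payoff g(a, a)
dev_gain = {"C": 3.0 - 2.0, "D": 0.0}   # one-shot gain from the best deviation (0 if none)

def B(lo, hi):
    # one application of the operator to the candidate interval [lo, hi]
    values = []
    for a in ("C", "D"):
        # v = (1-delta)*g(a) + delta*w is enforceable iff the deviation gain
        # (1-delta)*dev_gain[a] <= delta*(w - lo) for some promise w in [lo, hi]
        w_min = lo + (1 - delta) * dev_gain[a] / delta   # smallest enforcing promise
        if w_min <= hi:
            values.append((1 - delta) * stage[a] + delta * w_min)
            values.append((1 - delta) * stage[a] + delta * hi)
    # public randomization convexifies, so keep the interval hull
    return min(values), max(values)

lo, hi = 1.0, 2.0       # start from the feasible, individually rational interval
for _ in range(200):
    lo, hi = B(lo, hi)
print("symmetric SPE payoffs approx. [%.3f, %.3f]" % (lo, hi))
# With delta = 0.6 this converges to [1, 2]: full cooperation is sustainable.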
References
Abreu (1986) "Extremal Equilibria of Oligopolistic Supergames" JET, 39
Abreu (1988) "On the Theory of Infinitely Repeated Games with Discounting" Econometrica, 56
Abreu, Pearce and Stacchetti (1986) "Optimal Cartel Equilibria with Imperfect Monitoring" JET, 39
Abreu, Pearce and Stacchetti (1990) "Toward a Theory of Discounted Repeated Games with Imperfect Monitoring" Econometrica, 58
2010-06-07
Price Rigidities
Original article (link) posted: 16/08/2005
In reality, prices cannot be adjusted continuously.
...
On the demand side, past prices may affect the firms' current goodwill through consumers' learning about the good or switching costs. On the supply side, past prices affect current inventories.
...
The presence of price rigidities raises the possibility that price reactions are not bootstrap reactions but are simply attempts to regain or consolidate market share.
(Tirole (1988), p.253-4)
In Maskin and Tirole (1988), two firms choose their prices asynchronously: firm 1 chooses its price in odd periods and firm 2 in even periods. They consider Markov strategies, simple pricing strategies that depend only on the "payoff-relevant information", and look for a perfect equilibrium in which the firms use Markov strategies (they call it a "Markov perfect equilibrium").
Despite the restriction to simple (Markov) strategies, multiple equilibria exist (indeed, there also exist several kinked-demand-curve equilibria). However, it can be shown that in any Markov perfect equilibrium, profits are always bounded away from the competitive profit (which is 0).
...
The intuition here is that if firms were stuck in the competitive price region, with the prospects of small profits in the future, a firm could raise its price dramatically and lure its rival to charge a high price for at least some time (the rival would not hurry back to nearly competitive prices). Thus tacit collusion is not only possible (as in the supergame approach) but necessary. Furthermore, it can be shown that there exists only one pair of equilibrium strategies that sustain industry profits close to the monopoly profit. These strategies form a symmetric kinked-demand-curve equilibrium at the monopoly price, and they are the only symmetric "renegotiation-proof" equilibrium strategies (whatever the current price, the firms cannot find an alternative Markov perfect equilibrium that they both prefer).
(Tirole (1988), p.256)
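For what it is worth, the alternating-move structure is straightforward to put on a computer. The sketch below is only a rough illustration with my own demand and cost assumptions; it iterates the two value functions of the dynamic programming problem (the value when moving, given the rival's standing price, and the value when waiting, given one's own standing price) on a discrete price grid. I make no claim that this simple iteration converges to the kinked-demand-curve equilibrium described above.

# Bare-bones sketch of the alternating-move price game, computed on a price grid.
# Demand, costs, and the grid are my own illustrative choices, not Maskin-Tirole's.
import numpy as np

delta = 0.9
prices = np.linspace(0.0, 1.0, 21)      # price grid; marginal cost is 0
n = len(prices)

def profit(p_own, p_other):
    # homogeneous goods, demand 1 - p served by the cheaper firm (split if tied)
    d = max(0.0, 1.0 - min(p_own, p_other))
    if p_own < p_other:
        return p_own * d
    if p_own == p_other:
        return 0.5 * p_own * d
    return 0.0

V = np.zeros(n)              # V[j]: value when moving, rival's standing price is prices[j]
W = np.zeros(n)              # W[i]: value when waiting, own standing price is prices[i]
R = np.zeros(n, dtype=int)   # Markov reaction: index of the price chosen against prices[j]

for _ in range(500):                      # fixed number of sweeps; no convergence claim
    for j in range(n):                    # my move, rival stuck at prices[j]
        payoffs = [profit(prices[i], prices[j]) + delta * W[i] for i in range(n)]
        R[j] = int(np.argmax(payoffs))
        V[j] = payoffs[R[j]]
    for i in range(n):                    # rival's move, I am stuck at prices[i]
        k = R[i]                          # the rival uses the same Markov reaction
        W[i] = profit(prices[i], prices[k]) + delta * V[k]

print("reaction to rival price 1.0:", prices[R[-1]])
print("reaction to rival price 0.0:", prices[R[0]])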
2010-06-02
Remarks on APS by Tirole (1988)
Original article (link) posted: 04/08/2005
Tirole (1988) mentions APS in a supplementary section. His description of optimal collusion is very clear, so I will quote it here.
Abreu, Pearce and Stacchetti (1986, 1990) show that one can indeed restrict attention to a collusive phase and a punishment phase, characterized by payoffs V+ and V-, where V+ and V- are now the best and worst elements in the set of symmetric perfect equilibrium payoffs. Furthermore, the collusive phase and the punishment phase take simple forms. In the collusive phase, the firms produce output q+. The punishment phase is triggered by a tail test, i.e., it starts if the market price falls under some threshold level p+. Thus, the collusive phase is qualitatively similar to that presumed in Porter (1983) and Green and Porter (1984). The punishment phase, however, does not have a fixed length; rather, it resembles the collusive phase. The two firms produce (presumably high) output q- each. If the market price exceeds a threshold price p-, the game remains in the punishment phase; if it lies below p-, the game goes back to the collusive phase. Thus, the evolution between the two phases follows a Markovian process. The reader may be surprised by the "inverse tail test" in the punishment phase. The idea is that a harsh punishment requires a high output (higher than is even privately desirable); to ensure that the firms produce a high output, it is specified that in the case of a high price (which signals a low output) the game remains in the punishment phase. (Notice that if one restricted punishments to be of the Cournot type, the optimal length of punishment would be T = "infinity", from the APS result on the harshest possible punishment V-.)
(Tirole (1988), p.265)
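The two-regime structure is easy to simulate. The following sketch uses made-up numbers (the outputs q+ and q-, the thresholds p+ and p-, and the demand noise are all my own choices, not taken from APS or Tirole); it only mechanically applies the tail test in the collusive phase and the inverse tail test in the punishment phase.

# Simulation sketch of the two-regime (collusive / punishment) scheme described above.
# All numbers are illustrative; only the switching rules follow the quoted text:
# the collusive phase ends when the price falls below p_plus, and the punishment
# phase ends only when the price falls below p_minus (the "inverse tail test").
import random

random.seed(0)
q_plus, q_minus = 1.0, 2.5        # per-firm outputs in the two phases (assumed)
p_plus, p_minus = 5.5, 3.0        # thresholds for the tail tests (assumed)

def market_price(total_output):
    # stochastic inverse demand: random intercept minus total output
    return random.uniform(7.0, 9.0) - total_output

phase = "collusive"
history = []
for t in range(20):
    q = q_plus if phase == "collusive" else q_minus
    p = market_price(2 * q)                  # two symmetric firms
    history.append((t, phase, round(p, 2)))
    if phase == "collusive":
        if p < p_plus:                       # tail test: a low price triggers punishment
            phase = "punishment"
    else:
        if p < p_minus:                      # inverse tail test: only a low price ends it
            phase = "collusive"

for t, phase_t, p in history:
    print(t, phase_t, p)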
2010-05-31
Great professor Matsushima
Original article (link) posted: 03/08/2005
In the last part of the renegotiation section of Pearce (1992), I found an interesting description of Matsushima (1990) "Structure of Renegotiation in Infinitely Repeated Games" mimeo, Stanford University.
I cannot end this catalogue without mentioning an intriguing paper by Matsushima (1990). His idea is that just as an equilibrium specifies what will happen if its "instructions" are not obeyed, societies have metacodes indicating what happens when a social convention (equilibrium) is breached. The ensuing analysis is highly ingenious; to my astonishment, Matsushima emerges from a jungle of infinite sequences of social conventions and breaching rules, with an existence result. I will not try to explain the motivation for the solution concept; on that score, despite some enjoyable discussions with the author, I remain mystified.
(Pearce (1992), p.166)
By the way, I could not find Matsushima's paper; he does not seem to have published it, and Stanford does not have the working paper either. If you know something about this paper, please let me know. (Maybe I should ask him directly.)
2010-05-26
Folk theorems by FLM (1994)
Original article (link) posted: 03/08/2005
According to Pearce (1992), Fudenberg, Levine and Maskin (1994) "The Folk Theorem in Repeated Games with Imperfect Public Information" Econometrica, 62 represents the state of the art in discounted folk theorems for a broad range of information structures, and "anyone interested in repeated games should read it closely".
It is difficult to state their folk theorems without relying on mathematical symbols. Here, I try to summarize the key insight, quoting the relevant sentences from Pearce (1992).
To derive folk theorems, they consider two conditions, the individual full rank condition and the pairwise full rank condition. The former is needed to ensure that a player's different possible actions can be distinguished, and hence encouraged or discouraged. The latter is needed to obtain information that discriminates statistically between deviations by one player and deviations by another.
Without the individual full rank condition, it may not be possible to induce players to play some strategy, no matter what rewards are attached to signal realizations. In contrast,
this (="the individual full rank condition") guarantees that any behavior can be enforced if arbitrary continuation payoffs can be used.
The failure of the pairwise full rank condition explains the inefficient results by Radner, Myerson and Maskin (1986).
The problem there was that the only way to enforce good behavior was to punish both players in the event that output is low. Efficient (or nearly efficient) cooperation in a model where no player's actions are observed, generally requires that, when one player's continuation payoff is reduced, another's must be increased; surplus should be passed back and forth amongst players, not thrown away.
With the additional restrictions on the information structure guaranteed by the full rank conditions, FLM prove a folk theorem of virtually the same degree of generality as for perfect monitoring.
As a final remark, it should be noted that the pairwise full rank condition need not be satisfied at the equilibrium strategy profile itself.
It would have been reasonable to guess that, to prove that a desired profile "gamma" can be enforced (almost) efficiently, it would be necessary to impose pairwise full rank relative to deviations from "gamma". By contrast, all that is actually assumed is that, for each "i" and "j", there is some distinguishing "alpha" that allows i's and j's deviations to be distinguished, and not necessarily the same "alpha" for each pair of players! FLM demonstrate that a profile as close as desired to "gamma" can be found that puts a little weight on the strategies used in the "distinguishing profiles," and discriminates as required between deviations of different players.
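These rank conditions are easy to check numerically for a given public-signal structure. The sketch below does so for a made-up two-player, two-action example (the signal probabilities are mine, chosen only to illustrate the definitions): individual full rank asks whether the matrix whose rows are the signal distributions generated by a player's own actions (others fixed at the profile) has full row rank, and pairwise full rank for players i and j asks whether stacking their two matrices yields rank |Ai| + |Aj| - 1.

# Checking FLM-style full rank conditions for an assumed public-signal structure.
# Two players, actions {C, D}, three public signals.  Probabilities are made up.
import numpy as np

# pi[(a1, a2)] = distribution over the 3 public signals at that action profile
pi = {
    ("C", "C"): np.array([0.6, 0.3, 0.1]),
    ("C", "D"): np.array([0.3, 0.4, 0.3]),
    ("D", "C"): np.array([0.3, 0.4, 0.3]),
    ("D", "D"): np.array([0.1, 0.3, 0.6]),
}
actions = ("C", "D")
profile = ("C", "C")   # pure action profile at which the conditions are checked

def signal_matrix(player):
    # rows: the player's actions, others held at the profile; columns: signals
    rows = []
    for a in actions:
        prof = (a, profile[1]) if player == 0 else (profile[0], a)
        rows.append(pi[prof])
    return np.vstack(rows)

M1, M2 = signal_matrix(0), signal_matrix(1)

ifr1 = np.linalg.matrix_rank(M1) == len(actions)        # individual full rank, player 1
ifr2 = np.linalg.matrix_rank(M2) == len(actions)        # individual full rank, player 2
pfr = np.linalg.matrix_rank(np.vstack([M1, M2])) == 2 * len(actions) - 1  # pairwise

print("individual full rank:", ifr1, ifr2)
print("pairwise full rank  :", pfr)
# Here (C,D) and (D,C) induce the same signal distribution, so deviations by the
# two players cannot be told apart and pairwise full rank fails; this is exactly
# the situation behind the Radner-Myerson-Maskin inefficiency mentioned above.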
I should read FLM again...
2010-05-25
Information and timing
Original article (link) posted: 03/08/2005
With imperfect monitoring, shrinking the period length implies less discounting from one period to the next, but also leaves less time for players to observe signals relevant to behavior.
...
So there are two effects of reducing the period length: an effective increase in patience, which we know from the monotonicity result tends to increase the average value set, and a worsening of information, which Kandori (1992) has elegantly shown to decrease the set of equilibrium values. Either of these two effects can dominate in a particular case.
...
If the period of fixed action is a year, under plausible parameter values cooperation could be sustained very profitably. But suppose that instead actions can be changed daily. The only way to encourage cooperation is to punish the event that there is no good news, which has probability near 1 whether anyone shirks or not.
...
Ironically, in this case the player's ability to respond quickly to information destroys all possibility of cooperation. This suggests that delaying the release of information might actually be valuable in partnerships; Abreu, Milgrom and Pearce (1991) show that for high "delta", information delays can virtually eliminate the inefficiency that Radner, Myerson and Maskin (1986) identified.
(Pearce (1992), p.156)
Abreu, Milgrom and Pearce (1991) "Information and Timing in Repeated Partnerships" Econometrica, 59
Kandori (1992) "The Use of Information in Repeated Games with Imperfect Monitoring" RES, 59-3
Radner, Myerson and Maskin (1986) "An Example of a Repeated Partnership Game with Discounting and with Uniformly Inefficient Equilibria" RES, 53
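A back-of-the-envelope calculation makes the quoted point concrete. The following sketch is my own stylized version, not the Abreu-Milgrom-Pearce model: over a period of length dt, good news arrives with probability 1 - exp(-mu*dt) if both players work and 1 - exp(-lambda*dt) if someone shirks, and cooperation is enforced by destroying continuation value after any period with no good news. As dt shrinks, the value that has to be burned per unit of time explodes, which is the "worsening of information" effect.

# Stylized (my own, not AMP's) illustration of the information-vs-timing trade-off.
# Good news arrives at Poisson rate mu if both work, lam if someone shirks, and
# shirking saves flow cost c.  Cooperation is enforced by destroying continuation
# value whenever a period ends with no good news.
import math

mu, lam, c = 2.0, 0.5, 1.0     # assumed rates and flow deviation gain
for dt in (1.0, 0.5, 0.1, 0.01, 0.001):
    p_work  = 1 - math.exp(-mu * dt)     # P(good news | both work)
    p_shirk = 1 - math.exp(-lam * dt)    # P(good news | someone shirks)
    # smallest value drop K that deters shirking: c*dt <= K*(p_work - p_shirk)
    K = c * dt / (p_work - p_shirk)
    # expected value destroyed per unit of time on the cooperative path
    burn_rate = (1 - p_work) * K / dt
    print("dt=%6.3f  value burned per unit time: %8.3f" % (dt, burn_rate))
# The burn rate blows up as dt -> 0: reacting to information more quickly makes the
# "no good news" event almost uninformative, destroying the gains from cooperation,
# which is the point of the Abreu-Milgrom-Pearce discussion quoted above.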
2010-05-24
POMDP
Updated (July 24, 2010)
Their paper is now uploaded:
Kandori, M. and Obara, I. (2010) "Towards a Belief-Based Theory of Repeated Games with Private Monitoring: An Application of POMDP"
(The paper can be downloaded from the authors' website.)
When I had a chat with Prof. Kandori on Friday, he talked about his recent joint work with Ichiro Obara (UCLA) on repeated games with private monitoring. The key idea is to use a well-known method in dynamic programming called POMDP (partially observable Markov decision process).
They derived the (right notion of the) one-shot deviation principle for private monitoring and constructed a method to check whether an arbitrary strategy, expressed as a finite automaton, constitutes an equilibrium. I couldn't follow the arguments in detail, but what they have found sounds really path-breaking.
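I do not know the details of their construction, but the basic POMDP ingredient is easy to sketch: if the opponent plays a finite-automaton strategy, her automaton state is a hidden Markov state from my point of view, and my private signals let me update a belief over it by Bayes' rule. The toy filter below is my own illustration with made-up transition and signal matrices, not the Kandori-Obara model.

# Toy belief filter over an opponent's automaton state in a repeated game with
# private monitoring.  Matrices are made up for illustration only.
import numpy as np

# Opponent's 2-state automaton: state 0 ("cooperate"), state 1 ("punish").
# T[s, s'] = P(next state s' | current state s); in the real model this would
# depend on her own private signal, but I assume a fixed Markov transition here.
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])

# S[s, y] = P(I observe private signal y | opponent is in state s and acts accordingly)
S = np.array([[0.8, 0.2],    # she cooperates: signal "good" is likely
              [0.3, 0.7]])   # she punishes:   signal "bad" is likely

def update(belief, signal):
    # one step of the Bayes filter: condition on my signal, then push through T
    posterior = belief * S[:, signal]
    posterior /= posterior.sum()
    return posterior @ T

belief = np.array([1.0, 0.0])          # start certain she is in the cooperative state
for y in [0, 0, 1, 1, 1]:              # an assumed private-signal history (0=good, 1=bad)
    belief = update(belief, y)
    print("belief over opponent's state:", np.round(belief, 3))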
Looking forward to seeing their paper soon!
Information about POMDP websites:
A useful site with tutorials and code: link
A lecture video at Stanford (Machine Learning, Lecture 20): link
2010-05-21
Discounting and dynamic programming
Original article (link) posted: 03/08/2005
Radner (1985) demonstrated the existence of fully efficient perfect equilibria in a class of partnership games with the limit of means criterion.
...
In my opinion the repeated partnership games (and most repeated games) are better modeled with discounting than with the limit of means criterion.
...
The per-period nature of the problem is nicely reflected in the discounting case, where the loss gets capitalized in the value set: players must be punished (sooner or later) if current output is low. Without discounting, it is not necessary to deter shirking period by period: if a player cheats for k periods, it has no effect on his long-run average payoff. Only infinite strings of deviations are a problem, and these Radner detects using a "review strategy" that, according to the law of the iterated log, will yield a first-best equilibrium average payoff.
...
Statistical methods are ideally suited to guarding against long-run deviations, whereas dynamic programming methods are largely inapplicable at "delta"=1. With discounting, the problem of deterring current deviations leads naturally to the decomposition of a supergame profile into behavior today and continuation values for the future. The dynamic programming perspective has the benefit of unifying the treatment of patient and impatient players, infinite and finite horizon games, and implicit and explicit contracts.
(Pearce (1992), p.154-5)
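To get a feel for why statistical methods are suited to guarding against long-run deviations, here is a small simulation of a review-strategy test in the spirit of Radner (1985); the output process and the threshold rule are my own stylized choices, not his. Over a review phase of R periods, output is high with probability 0.6 if both partners work and 0.4 if one shirks, and cooperation continues only if the number of high-output periods exceeds a cutoff between the two means. Both error probabilities vanish as R grows, which is what makes near-efficiency attainable without discounting.

# Monte Carlo sketch of a review-strategy test (stylized, not Radner's exact model).
import random

random.seed(1)
p_work, p_shirk = 0.6, 0.4     # P(high output) when both work / when one shirks (assumed)

def error_rates(R, trials=5000):
    # estimate P(false alarm | work) and P(missed detection | shirk) for review length R
    threshold = 0.5 * R        # cutoff halfway between the two means
    false_alarm = sum(sum(random.random() < p_work for _ in range(R)) <= threshold
                      for _ in range(trials)) / trials
    missed = sum(sum(random.random() < p_shirk for _ in range(R)) > threshold
                 for _ in range(trials)) / trials
    return false_alarm, missed

for R in (10, 50, 200, 1000):
    fa, md = error_rates(R)
    print("R=%5d  false alarm: %.4f   missed detection: %.4f" % (R, fa, md))
# Both error rates go to zero as R grows.  With the limit-of-means criterion the
# punishment phases triggered by false alarms carry no weight in the long-run
# average, which is how Radner obtains fully efficient equilibria.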
Reference
Radner (1985) "Repeated Partnership Games with Imperfect Monitoring and No Discounting" RES, 53