Discounting and dynamic programming

Original article (link) posted: 03/08/2005

Radner (1985) demonstrated the existence of fully efficient perfect equilibria in a class of partnership games with the limit of means criterion.
In my opinion the repeated partnership games (and most repeated games) are better modeled with discounting than with the limit of means criterion.
The per-period nature of the problem is nicely reflected in the discounting case, where the loss gets capitalized in the value set: a player must be punished (sooner or later) if current output is low. Without discounting, it is not necessary to deter shirking period by period: if a player cheats for k periods, it has no effect on his long-run average payoff. Only infinite strings of deviations are a problem, and these Radner detects using a "review strategy" that, according to the law of the iterated logarithm, will yield a first-best equilibrium average payoff.
Statistical methods are ideally suited to guarding against long-run deviations, whereas dynamic programming methods are largely inapplicable at δ = 1. With discounting, the problem of deterring current deviations leads naturally to the decomposition of a supergame profile into behavior today and continuation values for the future. The dynamic programming perspective has the benefit of unifying the treatment of patient and impatient players, infinite and finite horizon games, and implicit and explicit contracts.

(Pearce (1992), pp. 154-5)
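
To make the contrast in the passage concrete, here is a minimal sketch in Python. It is not taken from Pearce or Radner. The payoff numbers (1.0 per period from cooperating, 1.5 per period while shirking), the discount factor 0.9, and the function names are all assumptions chosen for illustration. The sketch compares an unpunished k-period deviation under the two criteria: the long-run average barely moves, while the normalized discounted value rises, which is why with discounting each current deviation has to be deterred through lower continuation values.

```python
def discounted_value(head, tail_payoff, delta):
    """Normalized discounted value (1 - delta) * sum_t delta**t * u_t
    for a stream that follows `head` and then pays `tail_payoff` forever."""
    head_part = sum((1 - delta) * delta**t * u for t, u in enumerate(head))
    # The constant tail, normalized by (1 - delta), is worth
    # delta**len(head) * tail_payoff in today's terms.
    tail_part = delta ** len(head) * tail_payoff
    return head_part + tail_part


def average_payoff(head, tail_payoff, horizon):
    """Average payoff over the first `horizon` periods of the same stream.
    As `horizon` grows this approaches the limit-of-means payoff."""
    k = min(len(head), horizon)
    total = sum(head[:k]) + (horizon - k) * tail_payoff
    return total / horizon


# Hypothetical numbers: cooperation pays 1.0 per period, shirking pays 1.5.
COOPERATE, SHIRK = 1.0, 1.5
k = 5                      # length of the (unpunished) deviation
deviation = [SHIRK] * k    # shirk for k periods, then cooperate forever

for horizon in (10, 1_000, 1_000_000):
    print("average over", horizon, "periods:",
          average_payoff(deviation, COOPERATE, horizon))

print("discounted value of deviating:",
      discounted_value(deviation, COOPERATE, delta=0.9))
print("discounted value of cooperating:",
      discounted_value([], COOPERATE, delta=0.9))
```

Under these assumed numbers the averages shrink toward the cooperative payoff as the horizon grows, so a finite deviation is invisible to the limit of means criterion. The discounted value of deviating stays strictly above the cooperative value for any discount factor below one, so some future punishment must offset the gain.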


Radner (1985) "Repeated Partnership Games with Imperfect Monitoring and No Discounting" RES, 53
