
Adjustment of Discount Rate Using Index for Progress of Learning

Summary

We showed that adjusting the discount rate according to an index of learning progress can be effective. In the strategy we propose, the discount rate is kept small while learning has not yet progressed far enough, and is increased as learning advances. We also proposed three adjustment methods: exponential, TD-error-based, and reliability-based. All three were verified in numerical experiments on a windy grid world task.
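As a rough illustration of the strategy, the sketch below shows two ways a discount rate could start small and grow with learning progress, plus a tabular update that uses the adjusted rate. This summary does not give the formulas from the paper, so everything here is an assumption: the function names, the parameters (gamma_max, rate, scale), the specific exponential forms, and the use of a Q-learning update. The reliability-based method is omitted because its definition is not stated on this page.

```python
import math


def gamma_exponential(episode, gamma_max=0.95, rate=0.05):
    """Hypothetical schedule: rises from 0 toward gamma_max as episodes accumulate."""
    return gamma_max * (1.0 - math.exp(-rate * episode))


def gamma_from_td_error(mean_abs_td_error, gamma_max=0.95, scale=1.0):
    """Hypothetical schedule: small while TD errors are large, near gamma_max once they shrink."""
    return gamma_max * math.exp(-mean_abs_td_error / scale)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One tabular Q-learning step (assumed learner); returns the TD error."""
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error
```

In a training loop, gamma would be recomputed before each episode (from the episode count, or from a running average of the absolute TD errors returned by q_update) and then passed to every update in that episode.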

Figure: Success rate over each block of 20 episodes. Left: conventional method. Right: proposed method.

Figure: Transition of the normalized action value function. Left: conventional method. Right: proposed method.

Reference (in Japanese)

  1. Naoko Ogawa, Akio Namiki and Masatoshi Ishikawa. Adjustment of Discount Rate Using Index for Progress of Learning. IEICE Neurocomputing Meeting (Sapporo, Japan, 2003.2.4) / IEICE Technical Report, NC2002-129, pp. 73-78, Feb. 2003.
Ishikawa Senoo Laboratory, Department of Information Physics and Computing, Department of Creative Informatics,
Graduate School of Information Science and Technology, University of Tokyo