Adjustment of Discount Rate Using Index for Progress of Learning
We showed that adjusting the discount rate according to an index of learning progress can be effective. In the proposed strategy, the discount rate is kept small while learning has not yet progressed sufficiently, and is increased as learning advances. We proposed three adjustment methods (exponential, TD-error-based, and reliability-based) and verified them in numerical experiments on a windy grid world task.
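The summary above does not give the formulas behind the three methods, so the following is only a minimal sketch of how two of them might look. It assumes a tabular Q-learning agent; the function names, the bounds `gamma_min` and `gamma_max`, the time constant `tau`, and the `scale` parameter are all hypothetical choices made for illustration, not values from the paper. The reliability-based method is omitted because its index is not defined here.

```python
import math


def exponential_discount(episode, gamma_min=0.5, gamma_max=0.95, tau=50.0):
    """Hypothetical exponential schedule (assumed form, not from the paper).

    Starts at gamma_min and approaches gamma_max as episodes accumulate.
    """
    return gamma_max - (gamma_max - gamma_min) * math.exp(-episode / tau)


def td_error_discount(mean_abs_td_error, gamma_min=0.5, gamma_max=0.95, scale=1.0):
    """Hypothetical TD-error-based rule (assumed form, not from the paper).

    A large average |TD error| is read as "learning has not progressed",
    giving a discount rate near gamma_min; a small one gives a rate
    near gamma_max.
    """
    progress = math.exp(-mean_abs_td_error / scale)  # 1.0 when the error is zero
    return gamma_min + (gamma_max - gamma_min) * progress


def q_update(q, state, action, reward, next_state, gamma, alpha=0.1):
    """One tabular Q-learning step using the current discount rate gamma.

    q is a list of per-state lists of action values. Returns the TD error
    so the caller can feed it back into td_error_discount.
    """
    td_error = reward + gamma * max(q[next_state]) - q[state][action]
    q[state][action] += alpha * td_error
    return td_error
```

In a training loop, one would recompute `gamma` at the start of each episode (from the episode count or from a running average of the absolute TD error) and pass it to `q_update`. Both schedules keep the discount rate inside `[gamma_min, gamma_max]`, which matches the strategy of starting small and increasing as learning advances.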
Success rate per 20 episodes. Left: conventional. Right: proposed.
Transition of the normalized action value function. Left: conventional. Right: proposed.
Reference (in Japanese)
- Naoko Ogawa, Akio Namiki and Masatoshi Ishikawa. Adjustment of Discount Rate Using Index for Progress of Learning. IEICE Neurocomputing Meeting (Sapporo, Japan, 2003.2.4) / IEICE Technical Report, NC2002-129, pp. 73-78, Feb. 2003.