# On temporal difference algorithms for continuous systems

@inproceedings{Donz2005OnTD, title={On temporal difference algorithms for continuous systems}, author={Alexandre Donz{\'e}}, booktitle={ICINCO}, year={2005} }

This article proposes a general, intuitive and rigorous framework for designing temporal difference algorithms to solve optimal control problems in continuous time and space. Within this framework, we derive a version of the classical TD(λ) algorithm as well as a new, similar TD algorithm designed to be more accurate and to converge as fast as TD(λ) does for the best values of λ, without the burden of finding those values.
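For orientation, the classical TD(λ) algorithm named in the abstract can be sketched in its usual discrete-time, tabular form. This is a minimal illustration with accumulating eligibility traces, not the continuous-time formulation derived in the paper. The function name `td_lambda`, its parameters and default values, and the five-state corridor are all assumptions made for the example.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating eligibility traces.

    `episodes` is an iterable of trajectories. Each trajectory is a list of
    (state, reward, next_state) transitions, with next_state set to None
    on the terminating transition.
    """
    values = [0.0] * n_states
    for episode in episodes:
        traces = [0.0] * n_states  # eligibility traces reset every episode
        for state, reward, next_state in episode:
            next_value = 0.0 if next_state is None else values[next_state]
            # One-step temporal difference error.
            delta = reward + gamma * next_value - values[state]
            traces[state] += 1.0
            for s in range(n_states):
                # Each state is corrected in proportion to its trace,
                # then its trace decays by gamma * lambda.
                values[s] += alpha * delta * traces[s]
                traces[s] *= gamma * lam
    return values


# Hypothetical toy problem: a 5-state corridor walked left to right,
# with reward 1 on the final transition, so every true value is 1.
corridor = [(s, 0.0, s + 1) for s in range(4)] + [(4, 1.0, None)]
values = td_lambda([corridor] * 500, n_states=5)
```

On this toy corridor the learned `values` should all approach 1. Setting `lam=0` reduces the update to one-step TD, while values near 1 spread each error further back along the trajectory, which is the trade-off behind the search for good values of λ that the abstract mentions.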

#### 5 Citations

Temporal-difference learning for online reachability analysis

- Computer Science
- 2015 European Control Conference (ECC)
- 2015

This work proposes a novel online reachability update algorithm based on Temporal-Difference learning that is computationally more efficient and outperforms standard reachability-based controllers when it comes to other (non-safety) objectives.

Goal-Oriented Control of Self-Organizing Behavior in Autonomous Robots

- Computer Science
- 2010

We study adaptive control algorithms within a dynamical systems approach for autonomous robots that cause the self-organization of coordinated behaviors without specific goals or particular…

Hybrid Systems 1.1 Introduction

The research on hybrid systems at Verimag has as a major objective to export some ideas and insights originating from computer science toward other domains of applied science and engineering that do…

#### References

Temporal Difference Learning in Continuous Time and Space

- Mathematics, Computer Science
- NIPS
- 1995

A continuous-time, continuous-state version of the temporal difference (TD) algorithm is derived in order to facilitate the application of reinforcement learning to real-world control tasks and…

Reinforcement Learning in Continuous Time and Space

- Mathematics, Medicine
- Neural Computation
- 2000

This article presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB)…

A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions

- Mathematics, Computer Science
- Machine Learning
- 2004

A general convergence theorem is derived for RL algorithms when one uses only “approximations” of the initial data, which can be used for model-based or model-free RL algorithms, with off-line or on-line updating methods, for deterministic or stochastic state dynamics, and based on FE or FD discretization methods.

Relaxed dynamic programming in switching systems

- Mathematics
- 2006

In order to simplify computational methods based on dynamic programming, a relaxed procedure based on upper and lower bounds of the optimal cost was recently introduced. The convergence properties of…

Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems

- Mathematics, Computer Science
- IJCAI
- 1999

This paper describes variable resolution policy and value function representations based on Kuhn triangulations embedded in a kd-tree and derives a splitting criterion that allows one cell to efficiently take into account its impact on other cells when deciding whether to split.

On the Convergence of Optimistic Policy Iteration

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2002

A finite-state Markov decision problem is considered, and the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values, in conjunction with greedy policy selection, is established.

Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding

- Computer Science
- NIPS
- 1995

It is concluded that reinforcement learning can work robustly in conjunction with function approximators, and that there is little justification at present for avoiding the case of general λ.

Reinforcement Learning Using Neural Networks, with Applications to Motor Control. (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur)

- Physics, Computer Science
- 2002

The continuous TD(lambda) algorithm is refined to handle situations with discontinuous states and controls, and the vario-eta algorithm is proposed as a simple but efficient method to perform gradient descent.

Temporal Difference Learning and TD-Gammon

- Computer Science
- J. Int. Comput. Games Assoc.
- 1995

TD-GAMMON is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.

Reinforcement Learning: An Introduction

- Computer Science
- IEEE Transactions on Neural Networks
- 2005

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.