This webpage offers a series of benchmark problems for testing ADP/RL algorithms. In each we have found the *optimal* policy by creating and solving a discrete version of the problem.
We have found that popular ADP/RL algorithms based on various machine learning techniques can perform surprisingly poorly on classical inventory/storage problems. See
Daniel Jiang, Thuy Pham, Warren B. Powell, Daniel Salas, Warren Scott, “A Comparison of Approximate Dynamic Programming Techniques on Benchmark Energy Storage Problems: Does Anything Work?,” IEEE Symposium Series on Computational Intelligence, Workshop on Approximate Dynamic Programming and Reinforcement Learning, Orlando, FL, December, 2014.
- Energy storage datasets I – prepared by Daniel Salas
- Monotone problems – the value function is monotone in each dimension of the state variable
- Energy storage datasets II – prepared by Daniel Jiang (created June 3, 2015)
- Optimal stopping problems – prepared by Daniel Jiang (created June 3, 2015)
The datasets reflect a relatively simple energy storage problem consisting of (in its full form) one battery, a variable but free stochastic source (wind or solar), and a limitless source available from the grid at a random price, all used to serve a fairly predictable but time-varying load.
The Princeton energy storage benchmark datasets are a series of finite horizon problems that consist of four components:
- A renewable source of energy (free, but variable and usually stochastic).
- The power grid – an infinite supply of energy (and a market) at a random price.
- A load – usually time dependent, usually stochastic.
- A single storage device used to smooth out flows.
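To make the four components concrete, here is a minimal sketch of one time step of such a problem. All names, distributions, and parameters (capacity, efficiency, the daily load cycle) are illustrative assumptions, not the benchmark's actual specification:

```python
import random

def step(storage, t, capacity=100.0, efficiency=0.9):
    """One time step of a toy storage problem (illustrative, not the
    benchmark's exact model): wind is free but random, the grid price is
    random, and the load follows a simple daily cycle."""
    wind = random.uniform(0.0, 10.0)             # free renewable supply
    price = max(0.0, random.gauss(30.0, 10.0))   # stochastic grid price
    load = 20.0 + 10.0 * (t % 24 >= 8)           # time-varying demand

    # Naive policy: serve the load from wind first, then storage, then grid.
    from_wind = min(wind, load)
    from_storage = min(storage, load - from_wind)
    from_grid = load - from_wind - from_storage

    # Any leftover wind charges the battery (with losses), up to capacity.
    surplus = wind - from_wind
    storage = min(capacity, storage - from_storage + efficiency * surplus)
    cost = price * from_grid
    return storage, cost
```

The benchmarks replace the naive policy above with an optimized one; this sketch only shows how the four components interact in a single transition.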
Most of these problems use time-dependent processes; these might reflect a daily cycle (natural for energy storage), or they may simply be generated randomly from a time-dependent model.
The problems are described in the paper
The problems below include both deterministic and stochastic settings. The optimal benchmark for the deterministic problems was computed by solving the full problem as a linear program. The stochastic problems were solved as discrete Markov decision processes. A description of how to use the datasets is contained in
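As an illustration of how the stochastic benchmarks are computed, the following is a generic backward-induction solver for a finite-horizon discrete Markov decision process. The function names and interface are our own; the actual benchmark code discretizes the storage problem before applying this idea:

```python
def solve_storage_mdp(T, states, actions, reward, transition):
    """Backward induction for a finite-horizon discrete MDP.
    `actions(s, t)` lists feasible actions, `reward(s, a, t)` is the
    one-step reward, and `transition(s, a, t)` returns a list of
    (probability, next_state) pairs."""
    V = {s: 0.0 for s in states}          # terminal values
    policy = []
    for t in reversed(range(T)):
        V_new, pi = {}, {}
        for s in states:
            best = None
            for a in actions(s, t):
                q = reward(s, a, t) + sum(p * V[s2] for p, s2 in transition(s, a, t))
                if best is None or q > best:
                    best, pi[s] = q, a
            V_new[s] = best
        V = V_new
        policy.insert(0, pi)
    return V, policy
```

For a problem with a modest number of discretized storage levels this returns the exact optimal value and policy, which is what makes these datasets usable as benchmarks.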
The datasets include MATLAB code for generating the scenarios. For non-MATLAB users, the scenarios are also provided in a text file so that you can compare against the exact benchmarks.
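For readers working outside MATLAB, a scenario text file can be parsed with a few lines of code. The snippet below assumes a whitespace-delimited layout with `#` comment lines; the actual column layout varies by dataset, so consult the documentation bundled with each one:

```python
def load_scenarios(path):
    """Parse a whitespace-delimited scenario file into rows of floats,
    skipping blank lines and '#' comments. The column layout assumed
    here is only an example."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            rows.append([float(x) for x in line.split()])
    return rows
```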
We have been undertaking a body of research where we exploit monotonicity in the value function. The monotone-ADP algorithm, and descriptions of the datasets, are given in
These datasets are based on the Salas storage datasets (above), but include stochastic demands and use a more compact representation of the optimal policy. The datasets, with complete software and documentation, can be downloaded from:
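The core operation that monotonicity enables is a projection step: after updating the value estimate at one state, neighboring estimates are adjusted so the approximation stays monotone. The following is a simplified one-dimensional sketch (the benchmark problems are monotone in each dimension of a multidimensional state, and the function name is ours):

```python
def monotone_projection(values, index, new_value):
    """After observing `new_value` at `index`, restore the property that
    the value estimates are nondecreasing in the (1-D) state. Simplified
    sketch of the projection idea used in monotone-ADP."""
    values = list(values)
    values[index] = new_value
    # States below the updated one must not exceed the new value...
    for i in range(index):
        values[i] = min(values[i], new_value)
    # ...and states above it must not fall below it.
    for i in range(index + 1, len(values)):
        values[i] = max(values[i], new_value)
    return values
```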
The optimal stopping datasets, with complete software and documentation, can be downloaded from
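The structure of an optimal stopping problem can also be sketched in a few lines: at each time you either stop and collect a payoff, or continue and face the expected value of the future. The discrete setup and names below are illustrative, not the dataset's API:

```python
def optimal_stopping_value(payoff, T, states, transition):
    """Backward induction for a finite-horizon optimal stopping problem.
    `payoff(s, t)` is the reward for stopping in state s at time t;
    `transition(s, t)` returns (probability, next_state) pairs if we
    continue. At the horizon T we must stop."""
    V = {s: payoff(s, T) for s in states}
    for t in reversed(range(T)):
        V = {s: max(payoff(s, t),
                    sum(p * V[s2] for p, s2 in transition(s, t)))
             for s in states}
    return V
```

Because the state and time grids are finite, this gives the exact optimal value, which is the benchmark against which approximate methods are compared.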