NSF Workshop

A Conversation between Artificial Intelligence, Operations Research and Control Theory on Stochastic Optimization

Warren B. Powell

Satinder Singh

The problem of making decisions under uncertainty arises in a wide range of settings. As a result, different communities have developed models and algorithms suited to specific problem classes. A byproduct of this history is a variety of notational styles and algorithmic strategies that can impede the effective sharing of ideas between communities.

This website has grown out of an NSF-funded workshop that was held May 30 to June 1, 2012 at Rutgers University. However, the goal is to make this a continuing discussion. The original website, which includes the list of attendees, is still available here.

Please email any comments to powell@princeton.edu. I am thinking of putting together a general email list, but an alternative strategy is to build on existing email lists such as the RL chat list (from artificial intelligence), and the list for COSP (the stochastic programming community). I am also working on a blog that allows people to post comments publicly (much easier than emails).

Below are some conversation threads. I hope to link this to a blog, but I wanted to get the topics rolling. Please email me if you want to start a new conversation (and be patient with my lack of web skills).

Bridging stochastic search, dynamic programming and stochastic programming - These fields seem to intersect so much that perhaps we need to think of them as a single field.

What is a state variable - One of my favorite topics that was started during the workshop.

A name for our community - I think we need an umbrella name to bring our disparate communities together.

Topics from the workshop - If you are interested in what the workshop covered, go here.

Workshop program - The complete workshop program.

Please email me to start any new threads (powell@princeton.edu).


Bridging stochastic search, dynamic programming and stochastic programming

The fields of stochastic search, dynamic programming and stochastic programming have so much overlap that we need to start thinking of them as a single field. The first article below is a short discussion that appeared in the Informs Computing Society Newsletter. The second is an evolving discussion document (not intended for publication). The third is a paper that picks up this discussion using the context of transportation and logistics.

A Unified Framework for Stochastic Optimization (Informs Computing Society Newsletter article - Fall, 2012) - This is a short article that describes links between stochastic search, dynamic programming and stochastic programming, drawing on the discussions in the longer articles below.

AI, OR and Control Theory: A Rosetta Stone for Stochastic Optimization Updated July 13, 2012 - This is an informal discussion article (not intended for publication) where I make an attempt to bridge the different communities (just click on the link in the header). Although the original workshop focused on AI and OR, it is impossible to talk about certain topics in stochastic optimization (especially approximate dynamic programming) without recognizing the important contributions of the control theory community. This paper will be updated periodically. Comments are appreciated, either as notes on the pdf, or in email (but if you send email, please include the date on the front of the paper). Please view this article as simply a continuation of the conversation we started at the workshop.

Approximate Dynamic Programming in Transportation and Logistics: A Unified Framework (to appear in European J. on Transportation and Logistics) - This paper describes modeling of stochastic dynamic systems, introduces the four fundamental classes of policies (that I have been able to identify so far), and illustrates these ideas in the context of applications in transportation and logistics. There is a section dedicated to creating links between stochastic search, dynamic programming and stochastic programming (which is the basis for the discussion in the paper above).


What is a state variable?

This is a conversation that started before the workshop, but we did not have much time to continue it there. There were 30 responses to this question. Everyone then voted on the answers; the scores were averaged, and the responses were organized by whether they came from CS/AI or OR. The full spreadsheet is available here.

Two ideas seemed to stand out from the responses. The first was that a state variable should be a sufficient statistic, which means it contains all the necessary information. The second was that it should be "efficient," "parsimonious," or "minimal."

Two objections were raised about the use of the term "sufficient statistic." First, it can be seen as replacing one piece of jargon ("state variable") with another, which then needs its own definition. Second, the term "sufficient statistic" is widely used in statistics, where it has its own meaning.

There also seems to be a feeling that a state variable should contain only necessary information. Most models do this implicitly, but not all. The stochastic programming community routinely uses the entire history, which is certainly a sufficient statistic, but contains far more than is necessary.
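To make the distinction concrete, here is a rough sketch using a hypothetical inventory problem (this example is mine, not from the discussion): the full history of orders and demands is a sufficient statistic, but the current inventory level alone carries everything needed.

```python
# Hypothetical inventory example: the entire history is sufficient,
# but the current inventory level is the minimal (necessary) state.
import random

random.seed(0)

def step_with_history(history, order, demand):
    """State = full history of (order, demand) pairs; sufficient but not minimal."""
    history = history + [(order, demand)]
    inventory = sum(o - d for o, d in history)  # recomputed from scratch
    return history, inventory

def step_with_inventory(inventory, order, demand):
    """State = current inventory level; the minimal information needed."""
    return inventory + order - demand

history, inv_h = [], 0
inv = 0
for _ in range(5):
    order, demand = random.randint(0, 3), random.randint(0, 3)
    history, inv_h = step_with_history(history, order, demand)
    inv = step_with_inventory(inv, order, demand)
    assert inv == inv_h  # both states support the same computations
```

The history grows linearly with time while the compressed state stays a single number, which is the practical force behind asking for a minimal state.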

My favorite definition is from chapter 5 of my ADP book (you can download this from http://adp.princeton.edu). It reads

A state variable is the minimally dimensioned function of history that is necessary and sufficient to compute the decision function, the transition function, and the cost/reward function.

This definition is consistent with the concept of a sufficient statistic, but requires that the information be necessary as well (and therefore minimally dimensioned, which is redundant). It also provides a clear test for necessity: the state is the data needed to compute the decision function (for example, the set of feasible actions might depend on the state), the cost/reward function, and the transition function. Presumably, if a variable is needed only in the transition function, then it should in some way support the computation of the cost/reward function or the feasible region.
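The three roles in the definition can be sketched in code. This is a minimal illustration, assuming a hypothetical inventory problem of my own construction: the state (inventory level) is exactly the information needed by the feasible region, the reward function, and the transition function.

```python
# Minimal sketch of the three roles the state must serve, using a
# hypothetical inventory problem (capacity, price, and cost are made up).
CAPACITY = 10

def feasible_orders(state):
    """Decision function support: the feasible actions depend on the state."""
    inventory = state
    return range(0, CAPACITY - inventory + 1)

def reward(state, order, demand, price=2.0, cost=1.0):
    """Cost/reward function: revenue from sales minus ordering cost."""
    inventory = state
    sales = min(inventory + order, demand)
    return price * sales - cost * order

def transition(state, order, demand):
    """Transition function: next inventory level (no backlogging)."""
    inventory = state
    return max(0, inventory + order - demand)

state = 3
order = max(feasible_orders(state))  # a trivial, non-optimal policy
r = reward(state, order, demand=5)
state = transition(state, order, demand=5)
```

Anything not consumed by one of these three functions would, under the definition, not belong in the state.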

I believe that this is also consistent with the insistence that the state variable be independent of the choice of policy, although this raises some subtle issues. The state variable may contain information needed to determine the set of feasible actions (a.k.a. the feasible region) which is needed to compute the decision function.

Please share your thoughts by sending them to powell@princeton.edu. I will post comments that seem to be of general interest.


Computational stochastic optimization

Reinforcement learning, approximate dynamic programming, stochastic programming, stochastic search, simulation optimization, control theory, neuro-dynamic programming, heuristic dynamic programming, adaptive dynamic programming, policy search. That's a lot of names describing what we do.

Modern stochastic optimization requires a cross-fertilization of skills, and important ideas have emerged from across these communities. It would be great if we all used the same notation and a single set of terminology. This may take a while. But perhaps we can agree on an umbrella name.

I propose "computational stochastic optimization." Let me know what you think.


From the original workshop:


The workshop will be organized around the following primary topics:


Applications:

People tend to understand mathematical models in the context of problems they are familiar with. The workshop will start with a series of very short presentations describing specific applications that pose specific algorithmic challenges.


Modeling:

In sharp contrast with deterministic optimization, stochastic optimization is characterized by a range of modeling styles. We lack a consensus on fundamental concepts such as the state variable, which complicates the communication of models and algorithms.

Challenge problems:

Each community tends to develop tools that address a class of problems, and then tends to focus on problems that fit these tools. We will identify specific challenge problems that are likely to attract broad interest from the workshop.

Algorithmic strategies:

The heart of the workshop will be to compare and contrast different algorithmic strategies that have evolved within different communities. The ultimate goal of the field is the design of robust, general purpose algorithms that work for major problem classes. To do this, we need to recognize that specific algorithmic strategies seem to be particularly well suited to specific problem classes. What are these problem classes, and how do we identify them? How do we push the envelope to make algorithms more general and more robust?

Working groups will be tasked with tackling one of the challenge problems.

Contrasting research styles of different communities:

Sharing work may involve publishing in the outlets of different communities. The computer science and operations research communities have remarkably different styles in how and where they present their research. We hope to bring out these differences in style to foster sharing across communities.