Fast-tracking stationary MOMDPs for adaptive management problems

Martin Peron, Peter Bartlett, Kai Helge Becker, Iadine Chades

Research output: Chapter in Book/Report/Conference proceedingConference contribution book

Abstract

Adaptive management is applied in conservation and natural resource management, and consists of making sequential decisions when the transition matrix is uncertain. Informally described as ’learning by doing’, this approach aims to trade off between decisions that help achieve the objective and decisions that will yield a better knowledge of the true transition matrix. When the true transition matrix is assumed to be an element of a finite set of possible matrices, solving a mixed observability Markov decision process (MOMDP) leads to an optimal trade-off but is very computationally demanding. Under the assumption (common in adaptive management) that the true transition matrix is stationary, we propose a polynomial-time algorithm to find a lower bound of the value function. In the corners of the domain of the value function (belief space), this lower bound is provably equal to the optimal value function. We also show that under further assumptions, it is a linear approximation of the optimal value function in a neighborhood around the corners. We evaluate the benefits of our approach by using it to initialize the solvers MO-SARSOP and Perseus on a novel computational sustainability problem and a recent adaptive management data challenge. Our approach leads to an improved initial value function and translates into significant computational gains for both solvers.
LanguageEnglish
Title of host publicationProceedings of the Thirty-First AAAI Conference on Artificial Intelligence
Place of PublicationPalo Alto
Number of pages8
Publication statusAccepted/In press - 12 Nov 2016
EventThirty-First AAAI Conference on Artificial Intelligence - Hilton Hotel, San Francisco, United States
Duration: 4 Feb 20179 Feb 2017
http://www.aaai.org/Conferences/AAAI/aaai17.php

Publication series

NameProceedings of the AAAI Conference on Artificial Intelligence
PublisherAssociation for the Advancement of Artificial Intelligence
ISSN (Print)2159-5399

Conference

ConferenceThirty-First AAAI Conference on Artificial Intelligence
Abbreviated titleAAAI-17
CountryUnited States
CitySan Francisco
Period4/02/179/02/17
Internet address

Fingerprint

Natural resources management
Observability
Information management
Sustainable development
Conservation
Polynomials

Keywords

  • adaptive management
  • Markov decision processes
  • optimal trade-offs
  • polynomial-time algorithm
  • value function
  • mixed observability

Cite this

Peron, M., Bartlett, P., Becker, K. H., & Chades, I. (Accepted/In press). Fast-tracking stationary MOMDPs for adaptive management problems. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (Proceedings of the AAAI Conference on Artificial Intelligence). Palo Alto.
Peron, Martin ; Bartlett, Peter ; Becker, Kai Helge ; Chades, Iadine. / Fast-tracking stationary MOMDPs for adaptive management problems. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. Palo Alto, 2016. (Proceedings of the AAAI Conference on Artificial Intelligence).
@inproceedings{ccfd3a5f31e84ac4a4a1fb411486df1e,
title = "Fast-tracking stationary MOMDPs for adaptive management problems",
abstract = "Adaptive management is applied in conservation and natural resource management, and consists of making sequential decisions when the transition matrix is uncertain. Informally described as ’learning by doing’, this approach aims to trade off between decisions that help achieve the objective and decisions that will yield a better knowledge of the true transition matrix. When the true transition matrix is assumed to be an element of a finite set of possible matrices, solving a mixed observability Markov decision process (MOMDP) leads to an optimal trade-off but is very computationally demanding. Under the assumption (common in adaptive management) that the true transition matrix is stationary, we propose a polynomial-time algorithm to find a lower bound of the value function. In the corners of the domain of the value function (belief space), this lower bound is provably equal to the optimal value function. We also show that under further assumptions, it is a linear approximation of the optimal value function in a neighborhood around the corners. We evaluate the benefits of our approach by using it to initialize the solvers MO-SARSOP and Perseus on a novel computational sustainability problem and a recent adaptive management data challenge. Our approach leads to an improved initial value function and translates into significant computational gains for both solvers.",
keywords = "adaptive management, Markov decision processes, optimal trade-offs, polynomial-time algorithm, value function, mixed observability",
author = "Martin Peron and Peter Bartlett and Becker, {Kai Helge} and Iadine Chades",
year = "2016",
month = "11",
day = "12",
language = "English",
series = "Proceedings of the AAAI Conference on Artificial Intelligence",
publisher = "Association for the Advancement of Artificial Intelligence",
booktitle = "Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence",

}

Peron, M, Bartlett, P, Becker, KH & Chades, I 2016, Fast-tracking stationary MOMDPs for adaptive management problems. in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, United States, 4/02/17.

Fast-tracking stationary MOMDPs for adaptive management problems. / Peron, Martin; Bartlett, Peter; Becker, Kai Helge; Chades, Iadine.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. Palo Alto, 2016. (Proceedings of the AAAI Conference on Artificial Intelligence).

Research output: Chapter in Book/Report/Conference proceedingConference contribution book

TY - GEN

T1 - Fast-tracking stationary MOMDPs for adaptive management problems

AU - Peron, Martin

AU - Bartlett, Peter

AU - Becker, Kai Helge

AU - Chades, Iadine

PY - 2016/11/12

Y1 - 2016/11/12

N2 - Adaptive management is applied in conservation and natural resource management, and consists of making sequential decisions when the transition matrix is uncertain. Informally described as ’learning by doing’, this approach aims to trade off between decisions that help achieve the objective and decisions that will yield a better knowledge of the true transition matrix. When the true transition matrix is assumed to be an element of a finite set of possible matrices, solving a mixed observability Markov decision process (MOMDP) leads to an optimal trade-off but is very computationally demanding. Under the assumption (common in adaptive management) that the true transition matrix is stationary, we propose a polynomial-time algorithm to find a lower bound of the value function. In the corners of the domain of the value function (belief space), this lower bound is provably equal to the optimal value function. We also show that under further assumptions, it is a linear approximation of the optimal value function in a neighborhood around the corners. We evaluate the benefits of our approach by using it to initialize the solvers MO-SARSOP and Perseus on a novel computational sustainability problem and a recent adaptive management data challenge. Our approach leads to an improved initial value function and translates into significant computational gains for both solvers.

AB - Adaptive management is applied in conservation and natural resource management, and consists of making sequential decisions when the transition matrix is uncertain. Informally described as ’learning by doing’, this approach aims to trade off between decisions that help achieve the objective and decisions that will yield a better knowledge of the true transition matrix. When the true transition matrix is assumed to be an element of a finite set of possible matrices, solving a mixed observability Markov decision process (MOMDP) leads to an optimal trade-off but is very computationally demanding. Under the assumption (common in adaptive management) that the true transition matrix is stationary, we propose a polynomial-time algorithm to find a lower bound of the value function. In the corners of the domain of the value function (belief space), this lower bound is provably equal to the optimal value function. We also show that under further assumptions, it is a linear approximation of the optimal value function in a neighborhood around the corners. We evaluate the benefits of our approach by using it to initialize the solvers MO-SARSOP and Perseus on a novel computational sustainability problem and a recent adaptive management data challenge. Our approach leads to an improved initial value function and translates into significant computational gains for both solvers.

KW - adaptive management

KW - Markov decision processes

KW - optimal trade-offs

KW - polynomial-time algorithm

KW - value function

KW - mixed observability

UR - http://www.aaai.org/Conferences/AAAI/aaai17.php

UR - http://www.aaai.org/Press/Proceedings/proceedings.php

M3 - Conference contribution book

T3 - Proceedings of the AAAI Conference on Artificial Intelligence

BT - Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence

CY - Palo Alto

ER -

Peron M, Bartlett P, Becker KH, Chades I. Fast-tracking stationary MOMDPs for adaptive management problems. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. Palo Alto. 2016. (Proceedings of the AAAI Conference on Artificial Intelligence).