17583 | ИПУ РАН

Автор(ы):

Назин А. В. (ИПУ РАН, Лаборатория 07)

Миллер Б. М. (?)

Автор(ов):

Параметры публикации

Тип публикации:

Доклад

Название:

Mirror Descent Algorithm for Homogeneous Finite Controlled Markov Chains with Unknown Mean Losses

Электронная публикация:

Да

ISBN/ISSN:

978-3-902661-93-7

Наименование конференции:

18th IFAC World Congress (Milano, Italy, 2011)

Наименование источника:

Proceedings of the 18th IFAC World Congress (Milano, 2011)

Город:

Milan

Издательство:

International Federation of Automatic Control (IFAC)

Год издания:

2011

Страницы:

12421-12426

Аннотация

We consider the adaptive stochastic problem for a system described by a controlled Markov chain with a finite number of states. The novelty of the approach is in using of adaptation technique for optimization of the system with unknown distribution of the cost function. This approach is applicable to the Internet congestion control with active users, motivating our study. In fact, we consider a controlled homogenous finite Markov chain, introduce assumptions of processes independence, stationarity of positive random losses and a bound for their second moments, and regularity. The uncertainty is that the mean loss matrix is undefined. The related convex stochastic optimization problem as well as ideas of Mirror Descent Method give us the ability to design control algorithm and to prove the upper bound for the difference between expectation of time-averaged losses and their theoretical minimum. The main term of this upper bound, i.e. $O(sqrt{(N/T)ln(NK)})$, slightly depends on the state number $K$ which is standing inside the logarithm; here $N$ and $T$ stand for the number of control actions and the time horizon, respectively.

Библиографическая ссылка:

Назин А.В., Миллер Б.М. Mirror Descent Algorithm for Homogeneous Finite Controlled Markov Chains with Unknown Mean Losses / Proceedings of the 18th IFAC World Congress (Milano, 2011). Milano: International Federation of Automatic Control (IFAC), 2011. С. 12421-12426.