70477 | ИПУ РАН

Автор(ы):

Автор(ов):

Параметры публикации

Тип публикации:

Статья в журнале/сборнике

Название:

Two-Armed Bandit Problem and Batch Version of the Mirror Descent Algorithm

ISBN/ISSN:

ISSN 0005-1179

DOI:

10.1134/S0005117922080100

Наименование источника:

Automation and Remote Control

Обозначение и номер тома:

Vol. 83, No. 8

Город:

Москва

Издательство:

Pleiades Publishing, Ltd.

Год издания:

2022

Страницы:

1288–1307

Аннотация

We consider the minimax setup for the two-armed bandit problem as applied to data processing if there are two alternative processing methods with different a priori unknown efficiencies. One should determine the most efficient method and provide its predominant application. To this end, we use the mirror descent algorithm (MDA). It is well known that the corresponding minimax risk has the order of N^{1/2}, where N is the amount of processed data, and this bound is order sharp. We propose a batch version of the MDA which allows processing data by packets; this is especially important if parallel data processing can be provided. In this case, the processing time is determined by the number of batches rather than the total amount of data. Unexpectedly, it has turned out that the batch version behaves unlike the ordinary one even if the number of packets is large. Moreover, the batch version provides a considerably lower minimax risk; i.e., it substantially improves the control performance. We explain this result by considering another batch modification of the MDA whose behavior is close to the behavior of the ordinary version and the minimax risk is close as well. Our estimates use invariant descriptions of the algorithms based on Gaussian approximations of income in the batches of data in the domain of “close” distributions and are obtained by Monte-Carlo simulation.

Библиографическая ссылка:

Колногоров А.В., Назин А.В., Шиян Д.H. Two-Armed Bandit Problem and Batch Version of the Mirror Descent Algorithm // Automation and Remote Control. 2022. Vol. 83, No. 8. С. 1288–1307.

Публикация имеет версию на другом языке или вышла в другом издании, например, в электронной (или онлайн) версии журнала:

Да

Связь с публикацией:

Колногоров А.В., Назин А.В., Шиян Д.H. Two-Armed Bandit Problem, Data Processing, and Parallel Version of the Mirror Descent Algorithm // arXiv: [math.ST]. 2017. С. 1-11 https://arxiv.org/pdf/1705.09977.