76368

Author(s): 

Number of authors: 

6

Publication details

Publication type: 

Article in a journal/collection

Title: 

Implicitly normalized forecaster with clipping for linear and non‑linear heavy‑tailed multi‑armed bandits

Electronic publication: 

Yes

ISBN/ISSN: 

1619-6988

DOI: 

10.1007/s10287-023-00500-z

Source title: 

  • Computational Management Science

Volume designation and number: 

Vol. 21, No. 19

City: 

  • London

Publisher: 

  • Springer Nature

Year of publication: 

2024

Pages: 

1-29 https://link.springer.com/article/10.1007/s10287-023-00500-z
Abstract
The Implicitly Normalized Forecaster (INF) algorithm is considered to be an optimal solution for adversarial multi-armed bandit (MAB) problems. However, most of the existing complexity results for INF rely on restrictive assumptions, such as bounded rewards. Recently, a related algorithm was proposed that works for both adversarial and stochastic heavy-tailed MAB settings. However, this algorithm fails to fully exploit the available data. In this paper, we propose a new version of INF called the Implicitly Normalized Forecaster with clipping (INF-clip) for MAB problems with heavy-tailed reward distributions. We establish convergence results under mild assumptions on the rewards distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones. Furthermore, we show that INF-clip outperforms the best-of-both-worlds algorithm in cases where it is difficult to distinguish between different arms.

Bibliographic reference: 

Dorn Yu.V., Kornilov N.M., Kutuzov N.V., Nazin A.V., Gorbunov E.A., Gasnikov A.V. Implicitly normalized forecaster with clipping for linear and non‑linear heavy‑tailed multi‑armed bandits // Computational Management Science. 2024. Vol. 21, No. 19. P. 1-29 https://link.springer.com/article/10.1007/s10287-023-00500-z.