Sequential Learning in the Multi-Armed Bandit Setting

Our research project concerns the application of sequential learning algorithms to multi-armed bandit problems. We are currently interested in the structured thresholding bandit problem. In the thresholding bandit setting, one wishes to identify the arms whose means lie above a given threshold. In our case additional structure is introduced: the arms are assumed to have monotonically increasing means. As a result, sampling one arm yields information about the distributions of all arms, which leads to an interesting sequential learning problem. Our goal is to identify a minimax optimal algorithm for this problem given a limited budget of pulls. Following on from this, we are interested in developing problem-dependent bounds.
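
To illustrate why monotonicity helps, the following is a minimal sketch of a naive noisy binary search for the first arm whose mean reaches the threshold. It is not the minimax optimal algorithm the project aims to identify. The function names (pull, locate_threshold), the Gaussian noise model, and the uniform split of the budget across search steps are all assumptions made purely for illustration.

```python
import random


def pull(mean, rng, noise=1.0):
    """Hypothetical arm model: one noisy reward, Gaussian around the arm's mean."""
    return mean + rng.gauss(0.0, noise)


def locate_threshold(means, threshold, budget, rng, noise=1.0):
    """Estimate the smallest index whose mean is >= threshold.

    Assumes `means` is non-decreasing. Returns len(means) if every arm
    is judged to be below the threshold. The budget is split evenly
    across the binary search steps, which is a simplification rather
    than an optimal allocation.
    """
    n = len(means)
    # A binary search over the n + 1 possible answers needs at most
    # n.bit_length() comparisons.
    steps = max(1, n.bit_length())
    if budget < steps:
        raise ValueError("budget too small for one pull per search step")
    per_step = budget // steps

    lo, hi = 0, n  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        # Average several pulls of the middle arm to reduce the noise.
        avg = sum(pull(means[mid], rng, noise) for _ in range(per_step)) / per_step
        if avg >= threshold:
            # By monotonicity, every arm to the right of mid is also above.
            hi = mid
        else:
            # By monotonicity, every arm to the left of mid is also below.
            lo = mid + 1
    return lo


if __name__ == "__main__":
    rng = random.Random(0)
    arm_means = [0.1, 0.2, 0.4, 0.6, 0.9]  # illustrative values only
    first_above = locate_threshold(arm_means, threshold=0.5, budget=300, rng=rng)
    print("arms judged above the threshold:", list(range(first_above, len(arm_means))))
```

Each comparison rules out roughly half of the remaining arms without pulling them, which is the sense in which sampling one arm is informative about the others. A single wrong comparison sends this naive search down the wrong branch with no chance of recovery, so how to spend the budget across arms is exactly the kind of question a minimax analysis has to address.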
