Open Problems

Here are a couple of open problems I have found interesting during my research. Please feel free to get in touch if you are interested to discuss!

 

Problem 1

Do classic bandit algorithms such as UCB and Thompson sampling achive optimal order regret in the kernel-based bandit setting? See a formal description here.

 

Problem 2

Consider the noise-free kernel based bandit problem. Bull 2011 proved bounds on simple regret. Showing comparable bounds on cumulative regret remains an open problem.