mardi 6 mars 2018

PGMO lectures on Bandit convex optimisation (13-14 mars)

Sébastien Bubeck donnera les 13 et 14 mars à l'Ecole Polytechnique, une série de cours intitulée "Bandit convex optimization", dont voici l'annonce :
Dear Colleagues,

The Gaspard Monge Program for Optimisation and Operations Research and their interactions with data sciences (PGMO) is pleased to advertise the exceptional lecture series of

Sebastien BUBECK (Microsoft)



at Ecole polytechnique, Palaiseau, March 13--14, 2018

This is jointly organized by Ecole polytechnique and the master
program ``Optimization'' of Université-Paris Saclay, in the framework
of the PGMO program of Fondation Mathematique Jacques Hadamard,
with the support of EDF, Orange and Thales.

This lecture is intended to graduate students and researchers
(from academics or industry).

The schedule is the following
Lecture 1 - Tuesday, March 13, 10h00-12h30 Amphi Becquerel
Lecture 2 - Tuesday, March 13, 14h00-16h30, Amphi Becquerel
Lecture 3 - Wednesday, March 14, 10h00-12h30, Amphi Becquerel
Lecture 4 - Wednesday, March 14, 14h00-16h30, Amphi Becquerel

Registration (free of charge) is requested; register by filling
the form:

Owing to ``vigipirate'' security measures, attendees may have to
show an identification document.

To access to school:

For local information, contact: (PGMO, FMJH) (DEPMAP, Ecole polytechnique).

The scientific organizers
P. Carpentier (ENSTA)
S. Charousset (EDF)
S. Gaubert (INRIA and Ecole polytechnique)
J.-C. Pesquet (CentraleSupélec)
F. Santambrogio (Université Paris Sud)
G. Stoltz (Université Paris Sud, CNRS)
T. Tomala (HEC)


Summary of the lecture

The multi-armed bandit and its variants have been around for more than 80 years, with applications ranging from medial trials in the 1930s to ad placement in the 2010s. In this mini-course I will focus on a groundbreaking model introduced in the 1990s which gets rid of the unrealistic i.i.d. assumption that is standard in statistics and learning theory. This paradigm shift leads to exciting new mathematical and algorithmic challenges. I will focus the lectures on the foundational results of this burgeoning field, as well as their connections with classical problems in mathematics such as the geometry of martingales and high-dimensional phenomena.

Lecture 1: Introduction to regret. Game theoretic viewpoint (duality, Bayesian version of the game) and derivation of the minimax regret via geometry of martingales (brief recall of type/cotype and entropic proof for ell_1).

Lecture 2: Introduction to the mirror descent algorithm. Connections with competitive analysis in online computations will also be discussed.

Lecture 3: Bandit Linear Optimization. Two proofs of optimal regret: one via low-rank decomposition in the information theoretic argument, and the other via mirror descent with self-concordant barriers.

Lecture 4 : Bandit Convex optimization 1. Kernel methods for online learning, Bernoulli convolution based kernel. 2. Gaussian approximation of Bernoulli convolutions, and restart type strategies.

Aucun commentaire:

Enregistrer un commentaire