online decision-making

Adaptive Discretization for Model-Based Reinforcement Learning

We introduce the technique of adaptive discretization to design efficient model-based episodic reinforcement learning algorithms in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration …

Online Allocation and Pricing: Constant Regret via Bellman Inequalities

We develop a framework for designing simple and efficient policies for a family of online allocation and pricing problems, that includes online packing, budget-constrained probing, dynamic pricing, and online contextual bandits with knapsacks. In …

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is …

Predict and Match: Prophet Inequalities with Uncertain Supply

We consider a general class of finite-horizon online decision-making problems, where in each period a controller is presented a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on …

Uniform Loss Algorithms for Online Stochastic Decision-Making With Applications to Bin Packing

We consider a general class of finite-horizon online decision-making problems, where in each period a controller is presented a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on …

Predict and Match: Prophet Inequalities with Uncertain Supply

We consider a general class of finite-horizon online decision-making problems, where in each period a controller is presented a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on …

The Bayesian Prophet: A Low-Regret Framework for Online Decision Making

We develop a new framework for designing online policies given access to an oracle providing statistical information about an offline benchmark. Having access to such prediction oracles enables simple and natural Bayesian selection policies, and …

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is …

The Bayesian Prophet: A Low-Regret Framework for Online Decision Making

We develop a new framework for designing online policies given access to an oracle providing statistical information about an offline benchmark. Having access to such prediction oracles enables simple and natural Bayesian selection policies, and …

Ride Sharing

Ridesharing platforms such as Didi, Lyft, Ola and Uber are increasingly important components of the transportation infrastructure. However, our understanding of their design and operations, and their effect on society at large, is not yet well …