The performance of a service system is sensitive to customer abandonment. We focus on $G/GI/n+GI$ parallel-server queues, which serve as building blocks for modeling service systems. Consistent with recent empirical findings, such a queue has a general arrival process (the $G$) that can be time-inhomogeneous, iid service times with a general distribution (the first $GI$), and iid patience times with a general distribution (the $+GI$). By following the square-root safety staffing rule, companies can operate such queues in the quality- and efficiency-driven (QED) regime, which is characterized by a large customer volume, waiting times that are a fraction of the service times, only a small fraction of customers abandoning the queue, and high server utilization. We survey recent results on many-server queues that operate in the QED regime. These results include the sensitivity of system performance to the patience-time distribution and the use of diffusion models as a practical tool for performance analysis.
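As a rough illustration of the square-root safety staffing rule mentioned above, the sketch below computes a staffing level $n = R + \beta\sqrt{R}$, where $R$ is the offered load (arrival rate times mean service time). It assumes a constant arrival rate, although the abstract allows time-varying arrivals. The function name, the parameter names, and the numerical values are hypothetical and chosen only for this example; $\beta$ is a quality-of-service parameter that the user must choose.

```python
import math


def square_root_staffing(arrival_rate, mean_service_time, beta):
    """Return (offered_load, servers) under square-root safety staffing.

    Assumptions for this sketch (not taken from the abstract):
    - arrival_rate is constant over the staffing period;
    - beta is a user-chosen quality-of-service parameter.
    """
    # Offered load R: average amount of work arriving per unit time.
    offered_load = arrival_rate * mean_service_time
    # Staffing level: offered load plus a safety margin of order sqrt(R),
    # rounded up to a whole number of servers.
    servers = math.ceil(offered_load + beta * math.sqrt(offered_load))
    return offered_load, servers


# Hypothetical numbers: 100 arrivals per unit time, mean service time 1.
offered_load, servers = square_root_staffing(
    arrival_rate=100.0, mean_service_time=1.0, beta=0.5
)
# Offered load per server; close to 1 when the offered load is large.
load_per_server = offered_load / servers
print(offered_load, servers, round(load_per_server, 3))
```

With these hypothetical inputs the safety margin is only 5 servers on top of an offered load of 100, which is why server utilization stays high in this regime.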
This tutorial is presented jointly with Shuangchi He at the ISE Department of the National University of Singapore.