Home | Articles | Past volumes | About | Login | Notify | Contact | Search
 Stochastic Systems open journal systems 

Delay, memory, and messaging tradeoffs in distributed service systems

David Gamarnik, Massachusetts Institute of Technology
John N. Tsitsiklis, Massachusetts Institute of Technology
Martin Zubeldia, Massachusetts Institute of Technology


We consider the following distributed service model: jobs with unit mean, exponentially distributed, and independent processing times arrive as a Poisson process of rate \(\lambda n\), with \(0<\lambda<1\), and are immediately dispatched by a centralized dispatcher to one of \(n\) First-In-First-Out queues associated with \(n\) identical servers. The dispatcher is endowed with a finite memory, and with the ability to exchange messages with the servers.

We propose and study a resource-constrained “pull-based” dispatching policy that involves two parameters: (i) the number of memory bits available at the dispatcher, and (ii) the average rate at which servers communicate with the dispatcher. We establish (using a fluid limit approach) that the asymptotic, as \(n\to\infty\), expected queueing delay is zero when either (i) the number of memory bits grows logarithmically with \(n\) and the message rate grows superlinearly with \(n\), or (ii) the number of memory bits grows superlogarithmically with \(n\) and the message rate is at least \(\lambda n\). Furthermore, when the number of memory bits grows only logarithmically with \(n\) and the message rate is proportional to \(n\), we obtain a closed-form expression for the (now positive) asymptotic delay.

Finally, we demonstrate an interesting phase transition in the resource-constrained regime where the asymptotic delay is non-zero. In particular, we show that for any given \(\alpha>0\) (no matter how small), if our policy only uses a linear message rate \(\alpha n\), the resulting asymptotic delay is upper bounded, uniformly over all \(\lambda<1\); this is in sharp contrast to the delay obtained when no messages are used (\(\alpha = 0\)), which grows as \(1/(1-\lambda)\) when \(\lambda\uparrow1\), or when the popular power-of-\(d\)-choices is used, in which the delay grows as \(\log(1/(1-\lambda))\).

AMS 2000 subject classifications: 93E03

Keywords: Distributed service, dispatching policies, delay performance

Creative Common LOGO

Full Text: PDF

Gamarnik, David, Tsitsiklis, John N., Zubeldia, Martin, Delay, memory, and messaging tradeoffs in distributed service systems, Stochastic Systems, 8, (2018), 1-54 (electronic). DOI: 10.1214/17-SSY234.


[1]     Aghajani, R. and Ramanan, K. The hydrodynamic limit of a randomized load balancing network. arXiv preprint arXiv:1707.02005.

[2]     Badonnel, R. and Burgess, M. (2008). Dynamic pull-based load balancing for autonomic servers. In Proceedings of the Network Operations and Management Symposium (NOMS).

[3]     Bertsimas, D., Gamarnik, D. and Tsitsiklis, J. N. (2002). Performance of multiclass Markovian queueing networks via piecewise linear Lyapunov functions. The Annals of Applied Probability 11 1384–1428. MR1878302

[4]     Billingsley, P. (1999). Convergence of Probability Measures, Second ed. Wiley. MR1700749

[5]     Bramson, M. (1998). State space collapse with application to heavy traffic limits for multiclass queueing networks. Queueing Systems: Theory and Applications 30 89–148. MR1663763

[6]     Bramson, M., Lu, Y. and Prabhakar, B. (2013). Decay tails at equilibrium for FIFO join the shortest queue networks. The Annals of Applied Probability 23 1841–1878. MR3114919

[7]     Filippov, A. F. (1988). Differential Equations with Discontinuous Righthand Sides. Springer-Science. MR1028776

[8]     Foss, S. and Stolyar, A. L. Large-scale Join-Idle-Queue system with general service times. arXiv preprint arXiv:1605.05968.

[9]     Foster, F. G. (1953). On the stochastic matrices associated with certain queueing processes. The Annals of Mathematical Statistics 24 355–360. MR0056232

[10]     Gamarnik, D., Tsitsiklis, J. N. and Zubeldia, M. (2016). Delay, memory, and messaging tradeoffs in distributed service systems. In Proceedings of the ACM SIGMETRICS conference.

[11]     Hajek, B. (1982). Hitting-time and occupation-time bounds implied by drift analysis with applications. Advances in Applied Probability 14 502–525. MR0665291

[12]     Hunt, P. J. and Kurtz, T. G. (1994). Large loss networks. Stochastic Processes and their Applications 53 363–378. MR1302919

[13]     Kirszbraun, M. D. (1934). Über die zusammenziehende und Lipschitzsche Transformationen. Fund. Math 22 77–108.

[14]     Kurtz, T. G. (1981). Approximation of Population Processes. Society for Industrial and Applied Mathematics. MR0610982

[15]     Lobanov, S. G. and Smolyanov, O. G. (1994). Ordinary differential equations in locally convex spaces. Uspekhi Mat. Nauk 49 93–168. MR1289388

[16]     Lu, Y., Xie, Q., Kliot, G., Geller, A., Larus, J. R. and Greenberg, A. (2011). Join-Idle-Queue: A novel load balancing algorithm for dynamically scalable web services. Performance Evaluation 68 1056–1071.

[17]     Mitzenmacher, M. D. (1996). The power of two choices in randomized load balancing PhD thesis, U.C. Berkeley. MR2695522

[18]     Mitzenmacher, M. (2016). Analyzing distributed Join-Idle-Queue: A fluid limit approach. In Proceedings of the 54th Annual Allerton Conference on Communication, Control, and Computing.

[19]     Mitzenmacher., M., Prabhakar., B. and Shah, D. (2002). Load balancing with memory. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS).

[20]     Mukherjee, D., Borst, S., van Leeuwaarden, J. and Whiting, P. (2016). Universality of Power-of-d Load Balancing Schemes. In Proceedings of the Workshop on Mathematical performance Modeling and Analysis (MAMA).

[21]     Rudin, W. (1976). Principles of Mathematical Analysis, 3rd ed. McGraw-Hill. MR0385023

[22]     Shwartz, A. and Weiss, A. (1995). Large Deviations for Performance Analysis: Queues, Communications, and Computing. Chapman & Hall. MR1335456

[23]     Stolyar, A. L. (2015). Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems: Theory and Applications 80 341–361. MR3367704

[24]     Stolyar, A. L. (2017). Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers. Queueing Systems: Theory and Applications 85. MR3604117

[25]     Tsitsiklis, J. N. and Xu, K. (2012). On the power of (even a little) resource pooling. Stochastic Systems 2 1–66. MR2960735

[26]     van der Boor, M., Borst, S. and van Leeuwaarden, J. (2017). Load Balancing in Large-Scale Systems with Multiple Dispatchers. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM).

[27]     Vvedenskaya, N. D., Dobrushin, R. L. and Karpelevich, F. I. (1996). Queueing system with selection of the shortest of two queues: an asymptotic approach. Problems of Information Transmission 32 15–27. MR1384927

[28]     Xu, K. and Yun, S.-Y. Reinforcement with Fading Memories. Preprint.

[29]     Ying, L., Srikant, R. and Kang, X. (2015). The power of slightly more than one sample in randomized load balancing. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM).

Home | Articles | Past volumes | About | Login | Notify | Contact | Search

Stochastic Systems. ISSN: 1946-5238