Understanding the performance metrics of queueing systems is one of the areas where simulation has its strengths. In most operations planning, queues are considered wasteful and actions should be taken to reduce and minimize them. Queues at e.g. airports, train stations or factories are known to be very costly and a great deal of research is often undertaken to avoid them. In this article, I will present three common queues using the Kendall notation system (1953), explain the mathematics behind some key performance metrics and prove them by running simulations in AnyLogic.

## Queueing systems structures and their AnyLogic implementation

There are generally three things we care about in a queue:

- The arrival rate of the entities
- The processing rate of the entities
- The inventory of entities

Also, in queueing systems we have the following components:

- An entry point
- A queue
- A service point with a given amount of servers
- An exit point

Such queueing systems can be implemented with the following elements in AnyLogic:

The first block is called a “source” and this is the entry point of the entire queue system. The three following blocks “seize”, “delay” and “release” make up the logic of the queue. **seize **makes sure that when all servers are busy, the upcoming entity has to wait until a server is idle. **delay **is the actual processing of entities and it has a corresponding processing time. **release **makes sure that **seize **opens up, and thereafter lets in the next entity that was waiting for service. “sink” is the exit point of the simulation and it keeps track of how many entities have left the system. Lasty, the “resource pool” block defines how many servers are available to serve incoming entities.

## Basic math and D.G. Kendall’s notation for queueing systems

David George Kendall (1953) produced a compact and easily interpretable notation for queueing systems. It is, still today, a highly relevant way to refer to queue properties. The notation system goes as follows:

**A**: Interarrival time distribution

**B**: Service time distribution

**C**: Number of parallel servers

**D**: System capacity

**E**: Size of calling population

The letters are then placed sequentially with a corresponding value separated by “/” like so: A/B/C/D/E. Consider the following queueing system notation:

Here, we have a queue where the interarrival time distribution (**A**) is exponential, service time distribution (**B**) is exponential, the number of parallel servers (**C**) is 1, and lastly the system capacity (**D**) and size of calling population (**E**) is infinite. In compact notations we usually omit the parameters with infinite values, and we simply refer to it as an **M/M/1** queue.

In practice, this might be an unloading operation of vessels with only a single crane or a supermarket queue with only one cashier.

Two other queueing systems I will cover briefly are the following:

where we have **c** servers and

where we have **c** servers and system capacity **k**.

In compact notation form these are referred to as **M/M/c** and **M/M/c/k** queues. An example of an **M/M/c** queue can be a conditional conveyor allocating an entering entity to the idle machine out of a pool of **c** machines. An example of an **M/M/c/k** queue can be a packing facility with **c** operators and can only have a **k** amount of packages inside the building.

Going forward, we are interested in obtaining performance metrics of these queues. We begin with the M/M/1 single server queue, and establish some essential metrics:

Utilization of the server is given by:

where

λ = Arrival rate

μ = Service rate

This metric is quite simple and intuitive. If you e.g. have an arrival rate of 3 and a service rate of 4, then the server will be busy ¾ or 75% of the operational time.

The average inventory of the queue and of the entire system (queue + processing) is given by the two equations respectively:

where

ρ = utilization

Lastly, the average throughput of the queue and of the entire system (queue + processing) is given by the two equations respectively:

where

ρ = utilization

μ = service rate

Using these equations, we can apply it to a practical case study in the next sub-chapter.

## A practical case study using mathematics and simulation

In a manufacturing process we have one machine processing incoming units. If the machine is busy processing a unit, the next unit is placed in a queue with a FIFO (First-In-First-Out) queueing discipline. The manufacturing plant has a large capacity, so for simplicity the capacity is set to infinite. The calling population is also set to infinity. The following data is known for the system at steady state:

Interarrival rate | exponential(15,0) |

Service rate | exponential(20,0) |

Number of parallel servers | 1 |

System capacity | ∞ |

Calling population | ∞ |

Queueing discipline | FIFO |

Based on the data above, we calculate the metrics from using the equations from the previous chapter:

Utilization: ⍴ = 15/20 = **0.75**

Average inventory of the queue: Lq = **2.25**

Average inventory of the system: L = **3**

Average throughput of the queue: Wq = **0.15**

Average throughput of the system: W = **0.2**

From these computations, we can interpret that the machine is busy most of the time (75%) and the queue in front of the machine is on average 2.25 entities pr. time unit. The whole manufacturing process contains on average 3 entities. Also, the throughput of the queue and the whole manufacturing system does not differ much (0.15 against 0.2). These measures can further be validated through simulation.

The implementation of this queueing system case in AnyLogic is as follows:

From the simulation model we already know the **source**–**seize**–**delay**–**release**–**sink** logic as described in the introduction of the article at hand. The main additions to the model are the **timeMeasure **blocks. Here, **tmS_Q **and **tmE_Q** ensure that the time used in the queue is measured, and **tmS_S **and **tmE_S **similarly ensure that the time used in the overall manufaturing system is measured.

We run the simulation model once and let 1,000 entities enter the system. We obtain the following:

ρ = **0.769**

Lq = **3.540**

L = **4.310**

Wq = **0.241**

W = **0.294**

The outputs are quite close to the values computed mathematically. However, if we run the multiplication several times (say 1,000 times) and average each metric over all the runs, we will come even closer to the calculated values. After running the simulation 1,000 times we get these values:

ρ = **0.7501**

L = **3.0092**

Lq = **2.2510**

W = **0.2015**

Wq = **0.1511**

These are approximately the same values as the ones calculated at the beginning of the case. This goes to show that the performance metrics in a** M/M/1** queue can be produced both by using a mathematical (i.e. analytical) approach and a simulation simulation. However, the simulation approach gets more precise when the simulation is run multiple times.

## M/M/c and M/Mc/k queueing systems

For **M/M/c** and **M/M/c/k** queues, we can apply the same practice as with** M/M/1** queues. The structure of a **M/M/c** queue is the same as in the **M/M/1** queue except that the capacity of the server is altered to a parameter **c**:

The structure of a M/M/c/k queue (which needs a few more alterations) is given below:

Here, we have added a “selectOutput” block called **cLim** (which is basically an if-statement in block-form). This block makes sure that if the system capacity is filled up, the incoming entity is not allowed to enter the system, and exits the system through **sink3**. The “restrictedArea” blocks, **rS** and **rE** sets the system capacity (recall that this corresponds to the **D** parameter in the queueing notation).

For obtaining the same queue metrics as with an M/M/1 queue for M/M/c and M/M/c/k, a wealth of equations can be used similarly. However the complexity and size of the equations is increased slightly.

## Other important aspects of queueing

Unfortunately, data does not always tell the whole story when it comes to understanding queueing systems. For example, if the server in the queueing system is a doctor or an assembly worker, then a 95-99% utilization would be considered exhausting and is probably not going to be sustainable in the long run. Another example might be that the additional cost of increasing capacity of the servers might be too expensive compared to the gains in lower average inventory and increased throughput. Rather, theory of queueing systems could be useful to simplify the problem at hand and make it clear what to prioritize in order to increase the system performance.

## Related content and links

In this article I introduced simulation as a method that can replace analytical solutions. This is not only true for queueing systems. If you are interested in getting to know more about simulation you might find below list of links interesting. Additional information on AnyLogic can be obtained by visiting https://www.anylogic.com/.

- Link:
*Simulation-based capacity planning* - Link:
*Simulation methods for SCM analysts* - Link:
*Monte-carlo simulation in Python* - Link:
*Procedure model for discrete-event simulation* - Link:
*Open-pit mine simulation for better planning* - Link:
*Agent-based simulation framework in Python* - Link:
*Monte-carlo animation in R with gganimate* - Link:
*Simple conveyor line models in AnyLogic* - Link:
*Monte-carlo simulation in R for warehouse location risk assessment* - Link:
*Backlog simulation of FIFO production system* - Link:
*Discrete-event simulation in R with simmer* - Link:
*Receival inspection process simulated with simmer* - Link:
*Visual Components financial KPI simulation* - Link:
*Machine learning and discrete-event simulation*

Professional analyst focusing on how simulation, optimization and applied statistics can be used in operations research using technologies such as Python, R, and Java.

## Leave a Reply