Optimal policy for controlling two-server queueing systems with jockeying

2022-03-01 07:16:08LINBingLINYuchenandBHATNAGARRohit

Journal of Systems Engineering and Electronics 2022年1期

LIN Bing, LIN Yuchen, and BHATNAGAR Rohit

1.School of Business, Jiangsu Normal University, Xuzhou 221116, China;2.Wenzheng College, Soochow University, Suzhou 215104, China;3.Nanyang Business School, Nanyang Technological University, Nanyang 639798, Singapore

Abstract: This paper studies the optimal policy for joint control of admission, routing, service, and jockeying in a queueing system consisting of two exponential servers in parallel.Jobs arrive according to a Poisson process.Upon each arrival, an admission/routing decision is made, and the accepted job is routed to one of the two servers with each being associated with a queue.After each service completion, the servers have an option of serving a job from its own queue, serving a jockeying job from another queue, or staying idle.The system performance is inclusive of the revenues from accepted jobs, the costs of holding jobs in queues, the service costs and the job jockeying costs.To maximize the total expected discounted return, we formulate a Markov decision process (MDP)model for this system.The value iteration method is employed to characterize the optimal policy as a hedging point policy.Numerical studies verify the structure of the hedging point policy which is convenient for implementing control actions in practice.

Keywords: queueing system, jockeying, optimal policy, Markov decision process (MDP), dynamic programming.

1.Introduction

Queueing models are now widely used to study the manufacturing systems, public service systems, distributed computer systems, data communication networks, traffic flow systems, healthcare operations management, etc.A queueing system typically consists of three components:jobs, queues, and servers.Here, a job could be a part, a telephone call, a data file, a computer program, a patient,or a plane.Correspondingly, a server could be a workstation in a plant, a telecommunication transmission channel,a data transmission channel, a CPU, a clinic, or a runway.

Controls are often applied to queueing systems to improve system performance.Queueing controls usually take the form of static (open-loop)control and dynamic (closeloop)control.For dynamic controls, we can dynamically change some parameters of queueing systems, such as arrival rate and service rate, or we may implement the rules for routing jobs to parallel servers upon job arrivals.

Queueing control problems have been extensively studied in literature.Here we mention a few significant works among the earliest ones.For the admission control models, readers may refer to [1-4] for detailed illustration.The research works [5-8] considered routing control problems while [9] and [10] studied the service rate control problems.For the joint admission and routing controls, [11] and [12] were among the earliest to study this issue.Comprehensive surveys on controlling queueing systems can be found in [13].

In this paper, we study the optimal joint control of admission, routing, service, and jockeying in a queueing system of two parallel servers.Jobs arrive according to a Poisson process.Upon each arrival, a system controller will decide which job is admitted into the system and to which server an admitted job is sent.Each server is associated with a queue with no capacity limit.Two exponential servers with distinct service rates are controlled in the following manner: once a service is completed, a server may stop service, or serve the job from its own queue, or serve a jockeying job from another queue.

The system performance is measured in terms of the revenues from the accepted jobs, the holding costs for jobs in queue, the service costs for processing jobs, and the jockeying costs associated with transferring jobs from one queue to another.To characterize the optimal control policy, we formulate a Markov decision process (MDP)model with an objective to maximize the total expected discounted return in infinite horizon.

Job scheduling and logistics and supply chain coordination are challenging as shown in [14-17].Our research is motivated by the example of managing global supplychains with multiple production bases since many transnational corporations maintain two production bases to fulfill their global operations strategies.For instance,Zara and H&M, global leaders in apparel industry, have one production base in Europe and the other in Asia.The well-known sportswear companies Nike and Adidas keep two production bases: one in China and the other in Southeast Asian countries.Haier, a leading manufacturer in home appliances, operates one production base in China serving the whole global market and the other in the US focusing on the American market.Even for local supply chains in China, our research can find potential applications.Suppose a manufacturing firm has two separate plants in the Pearl River Delta and the Yangtze River Delta, the two most developed areas in China.When customer orders arrive, the firm dispatches the orders to two plants.To fully exploit the production and logistics resources of two plants, the firm further allows order jockeying before their final entry into the production process.For instance, an order originally assigned to the plant in the Pearl River Delta can be conveniently reassigned to the plant in the Yangtze River Delta via information systems.Along with the above applications in operations management, our model can be applied to telecommunication networks, computer systems, and vehicular traffic flow.One specific example given in [18] is that of a multibeam satellite system that serves the earth-based stations organized into disjoint zones.For such a system,an effective routing and jockeying control rule is necessary to achieve efficient packet transmission.

The research on queueing systems with jockeying dates to the 1950s.However, most of the studies are classified into the descriptive models which focus on performance evaluation under specific jockeying rules.In this stream of research, [19-34] are typical works.The shortest queue problem with two parallel queues and threshold jockeying was studied in [19].Adan et al.[20] and Zhao et al.[21] further studied the case of multiple parallel queues.The special case of the shortest queue problem with instantaneous jockeying, i.e., jockeying occurs whenever one queue is shorter than others, was examined in [22-27].For more recent research on the shortest queue problem with jockeying, please refer to [28,29].For the applications, Jeganathan et al.[30] studied the jockeying in inventory management while Stagje [31]considered the jockeying of cost-conscious customer in service systems.Tarabia [32] examined jockeying in parallel queues in the case of restricted capacities.Baykasoglu et al.[33] modeled the jockeying problem associated with manufacturing systems and Chaleshtori et al.[34]analyzed the location-allocation problem with jockeying.

In contrast with numerous studies on controlling queueing systems in terms of admission, routing, and service, there are only a few on jockeying.Xu et al.[18]considered optimal control of routing and jockeying in a two-station queueing system which had a Poisson arrival of jobs and exponential service time at two stations in parallel.They formulated the queueing control problem as an MDP and used dynamic programming to characterize the optimal policy as a switching-curve policy for both discounted and long-run average cost criteria.Down et al.[35] studied a system of multiple parallel queues with each queue having a dedicated arrival stream.They first discussed the condition under which the policies yield a stable Markov chain.For the two-server case with Poisson arrivals and exponential service time, they formulated an MDP model to characterize the optimal policy which was used to further develop a heuristic policy for the general case.Rosberg et al.[36] investigated a problem of energy efficiency for stochastically assigning jobs in a server farm with multiple processor-sharing servers and finite buffer sizes.The cases of jockeying and no jockeying were considered.For the case with jockeying,the authors formulated the problem as a semi-MDP to derive the optimal assignment policy and two heuristic policies.Dehghanian et al.[37] examined the optimal joining and jockeying policy for a queueing system of two stations in parallel.Customers arrived in a Poisson stream and chose to join one of the two stations with one chance of jockeying to the other queue.The optimal individual policies for joining and jockeying were characterized as the monotone threshold policy with the problem formulated as an MDP to minimize total holding and jockeying costs.

Our research is closely related to [18].Both [18] and this paper use MDPs to characterize the optimal control policy for a system of two parallel queues with infinite buffer sizes.However, a significant difference exists between the two papers.Regarding models, only the routing and jockeying controls were studied in [18] which was motivated by a multibeam satellite system serving earth-based stations that were organized into disjoint zones.Since our research is oriented towards applications in operations management which are more concerned with closely matching supply with customer orders, we study a more complex model which includes joint controls of admission, routing, service, and jockeying.Consequently, the optimal policy in their paper was characterized as monotonically nondecreasing switching curves while the optimal policy in this paper was a hedging point policy characterized by one nonincreasing switching curve and one nondecreasing switching curve.Moreover, we consider unit jockeying, i.e., transferring only one job each time from one queue to another, while they studied the case of batch jockeying under which multiple jobs were transferred each time.Obviously, unit jockeying is more responsive than the batch jockeying.Therefore, unit jockeying is more likely to be implemented in the operations management systems which are often intended for quick response to customer orders.

To the best of our knowledge, the problem in this paper has never been studied before.Our main contribution is the incorporation of jockeying controls into queueing systems, the formulation of a general model for various applications as mentioned above, and the subsequent characterization of the optimal control policy.

The rest of the paper is organized as follows.In Section 2, we present the model formulation.The optimal policy is characterized in Section 3 and system analysis is conducted in Section 4.Numerical studies are given to illustrate our results in Section 5.Conclusions are drawn in Section 6.

2.Model formulation

The problem discussed in this paper is illustrated in Fig.1.

Fig.1 Queueing system with two servers in parallel

The system state isx= (x1,x2), wherex1andx2denote the number of jobs in two queues.Hence,x1andx2are nonnegative integers, i.e.,x= (x1,x2)∈X, where the state spaceXis a two-dimensional nonnegative integer set.Jobs arrive according to a Poisson process with rateλ.Upon arrival, each job is subject to admission control.The rejected jobs are lost, and an accepted job is routed to one of the two servers.The server-dependent revenues generated from the accepted jobs arer1andr2,respectively.The service times follow the exponential distributions with ratesμ1andμ2, respectively.Suppose the job holding cost function ish(x)=h1x1+h2x2, whereh1andh2are the unit holding cost per unit time at two queues.The unit service costs at two servers arec1andc2,respectively.The unit job jockeying costc12orc21is associated with transferring a job from Queue 1 to Queue 2 or from Queue 2 to Queue 1.Further, we assumeλ＜μ1+μ2which assures the stability of the system in the long run.

For the above-described queueing system, the admission/routing control actions are made only at the decision epochs when each job arrives while the service and jockeying control actions are made at the decision epochs when the service of a job is completed.Thanks to the memoryless property of the exponential inter-arrival time and service time distributions, the system evolution is only influenced by the control actions made at the decision epochs.Consequently, the system evolution forms a two-dimensional continuous-time Markov chain, and all decision epochs are the Markov renewal points of the process.Hence, we can restrict our attention to Markov policies since a Markov policy is the optimal.

Denote the control action bya= (a0,a1,a2), wherea0=0, 1, 2, indicating the control action of rejecting a job,routing an accepted job to Server 1, and routing an accepted job to Server 2, respectively;ai= 0, 1, 2 (i= 1, 2), indicating the control action of stopping service, serving its own job, and serving a jockeying job, respectively.Thus,the control actionatakes integer values within the finite set [0,1,2]×[0,1,2]×[0,1,2].An admissible policyuconsists of a sequence of functionsu= {u0,u1,u2, ···}∈U,whereUdenotes the set of all admissible policies.At each decision epochk(k= 0, 1, 2,···), the functionukmaps the statex= (x1,x2)into a control action, i.e.,uk(x)=afor allx∈X.

Letα(α ＞ 0)denote the continuous-time discount rate.For the infinite horizon model of the problem considered in this paper, given the initial statex= (x1,x2), the total expected discounted return associated with a policyu,which is denoted byJu(x), can be written as

whereMi(t),Ni(t)(i= 1, 2),N12(t), andN21(t)represent the number of jobs accepted to two servers, the number of service completed at two servers, the number of jobs transferred from Queue 1 to Queue 2, and the number of jobs transferred from Queue 2 to Queue 1 up to timet, respectively.Next, we seek a stationary policyu= {u∞,u∞,u∞,··}∈Uto maximizeJu(x).Thus, the optimal value functionVcan be written as follows:

The aboveVis then shown to satisfy the Bellman equation based on the dynamic programming optimality principle.Following the technique in [38], we uniformize the transition rate asΛ= λ+μ1+μ2.After the uniformization of transitions, the original continuous-time Markov chain is transformed into a probabilistically equivalent system observed at evenly spaced points in time.In other words, a random event occurs in the system with a rateΛand the event happens to be a job arrival with the probabilityλ/Λ, a service completion at Server 1 with the probabilityμ1/Λ, a service completion at Server 2 with the probabilityμ2/Λ.Next, we describe in sequel the system transitions considering the control actions.

Denotee1= (1, 0)Tande2= (0, 1)T.Given a state (x1,x2)and for each control actiona, the next state after transition is denoted byy.Then we can define the transition probability functionp(y|x,u)as

whereI{·} is the indicator function.For instance, the termimplies that an event of service completion happens at Server 1 with the probabilityand a job is transferred from Queue 2 to Queue 1 if the actiona1= 2 is selected, leading to a transition from the current statexto the next (x-e2).

In the following analysis, for the convenience of notations and analysis, we define the operatorsT0,T1,T2as follows:

whereT0,T1, andT2are the operators for admission/routing controls, service and jockeying controls at Server 1,and service and jockeying controls at Server 2, respectively.

Then, from the principle of optimality of dynamic programming, the optimal value functionVcan be shown to satisfy the following Bellman equation:

where the dynamic programming operator (or the Bellman operator)Tis as follows:

Furthermore, without loss of generality for analysis and computation, we can rescale the time to achieveα+Λ= 1.Hence,TVcan be simplified as

3.Optimal control policy

In this section, we explore the structure of the optimal policy.DefineVsas the set of functions onsuch that ifV∈Vs, then the following properties exist:

(i)V(x+e1)-V(x)↓x1↓x2;

(ii)V(x+e2)-V(x)↓x1↓x2;

(iii)V(x+e1)-V(x+e2)↓x1↑x2.

↑ and ↓ indicate non-decreasing and non-increasing, respectively.TheV(x+e1)-V(x)↓x1in (i)and theV(x+e2)-V(x)↓x2in (ii)refer to the discrete concavity inx1andx2, respectively.TheV(x+e1)-V(x)↓x2in (i)and theV(x+e2)-V(x)↓x1in (ii)are identical and referred to as the submodularity ofV(x).In some papers,(iii)refers to the subconcavity ofV(x), whereV(x+e1)-V(x+e2)↓x1impliesV(x+e1)-V(x+e2)≥V(x+ 2e1)-V(x+e1+e2)andV(x+e1)-V(x+e2)↑x2suggestsV(x+e1)-V(x+e2)≤V(x+e1+e2)-V(x+ 2e2).For brevity,we will use increasing (decreasing)and non-decreasing(non-increasing)interchangeably.Next result shows the existence of an optimal policy.

Proposition 1There exists an optimal deterministic stationary policy for (1).

ProofSince the state space is discrete and the control action set is finite for eachx∈Xin our model, we can find an admissible policyuto attain the maximum of the right-hand side of (1).According to Theorem 6.2.10 in [39],uis a stationary optimal policy.□

Then we show that all the operators defined above propagatethe structural properties(i)-(iii).

Lemma 1T0V,T1V, andT2VandTV∈V sifV∈V s.

ProofReaders may refer to [40] for the detailed proofs of such typical operations.□

Next lemma shows that the optimal value function retains the properties (i)-(iii).s

Lemma 2The optimal value functionV∈V.

ProofThe result is proved by value iteration.LetV0=0 which is inV s.Based onLemma 1, we applyTrepeatedly toV0, leadinng toTnV0∈V sfor alln.Asnapproaches infinity, (TV0)(x)takes the point-wise convergence to the optimal value functionV(x)for allx.Hence,Vretains all the structural properties (i)-(iii)based on the knowledge of mathematical analysis.Thus,V∈V s.□

Then we define the switching functions which are necessary to characterize the structure of the optimal policy.Thus, we have

From the above definitions, the switching functionsS1andS2are associated with the admission and routing decisions.The switching functionsLi(i= 1, 2, 3, 4)are associated with the service and jockeying controls at Server 1.And the switching functionsGi(i= 1, 2, 3, 4)are associated with the service and jockeying controls at Server 2.In the following discussion, we demonstrate that the switching functions are monotone with respect to the state variables.

Lemma 3S1(x1)is decreasing inx1andS2(x1)is increasing inx1;L1(x1)is decreasing inx1andL2(x1)is increasing inx1;G1(x1)is decreasing inx1andG2(x1)is increasing inx1.

ProofWe only prove the result ofS1(x1)and the rest can be proved analogously.

ForS1(x1), it can be regarded as a function of two parts made up from the following two respective functions:

that is,

Further, the concavity ofVimplies that

After comparing the above two inequalities, we haveHence,is decreasing inx1.Analogously, we can show thatis decreasing inx1.

Based on the above result, as in Fig.2, the left part of(the solid curve on the left side of the intersection point)and the right part of(the solid curve on the right side of the intersection point)are combined to formS1(x1)which is also decreasing inx1.

Fig.2 Hedging point policy for admission and routing controls

Other results are proved analogously.□

The above lemma states thatSi,Li, andGi(i= 1, 2)are monotone in the state variables.From now on, we can call them switching curves because each of them partitions the state space into two distinct decision regions.LiandGi(i= 3, 4)are the degenerate switching functions characterized by a hedging point, i.e., the so called critical point or threshold in other papers, which segment one axis (horizontal or vertical)into two decision parts.In Fig.2, the solid curveS1(x1)and the solid part ofS2(x1)partition the whole state space into three decision regions and the intersection point ofS1(x1)andS2(x1)is called the hedging point.The decision regions in Fig.3 are interpreted in the same manner.Next, we demonstrate that the hedging point policy as illustrated in Fig.2 and Fig.3 is optimal.

Fig.3 Hedging point policy for service and jockeying controls

Theorem 1The optimal policy for admission and routing controls is a hedging point policy characterized by the monotone switching curvesS1(x1)andS2(x1)(solid curves in Fig.2); the optimal policy for service and jockeying controls at Server 1 is a hedging point policy characterized by the monotone switching curvesL1(x1)andL2(x1)(solid curves in Fig.3)whileL3(x2= 0)andL4(x1=0)are for the degenerating cases when the states lie on the axes; the optimal policy for service and jockeying controls at Server 2 is also a hedging point policy characterized by the monotone switching curvesG1(x1)andG2(x1)(solid curves in Fig.3)whileG3(x2= 0)andG4(x1= 0)are for the degenerating cases when the states lie on the axes.Given a state (x1,x2), the optimal control actions are prescribed by the optimal policy as follows:

(i)Regarding the admission and routing controls, an incoming job will be accepted if and only ifx2＜S1(x1); otherwise reject it.An accepted job is routed to Server 1 ifx2≥S2(x1); otherwise, it is routed to Server 2.

(ii)For the service and jockeying controls at Server 1,there are four cases.

First, whenx1＞ 0 andx2＞ 0, the server remains active ifx2≥L1(x1); otherwise, stays idle.The active server processes the job from its own queue ifx2＜L2(x1); otherwise, processes a jockeying job from another queue.Second, whenx1＞ 0 andx2= 0, the server keeps serving its own jobs ifx1≥L3(x2= 0); otherwise, stays idle.Third, whenx1= 0 andx2＞ 0, the server keeps serving the jockeying jobs ifx2≥L4(x1= 0); otherwise, stays idle.Fourth, whenx1= 0 andx2= 0, the server stays idle.

(iii)For the service and jockeying controls at Server 2,there are four cases.

First, whenx1＞ 0 andx2＞ 0, the server remains active ifx2≥G1(x1); otherwise, stays idle.The active server processes the job from its own queue ifx2≥G2(x1); otherwise, processes a jockeying job from another queue.Second, whenx1＞ 0 andx2= 0, the server keeps serving the jockeying jobs ifx1≥G3(x2= 0); otherwise, stays idle.Third, whenx1= 0 andx2＞ 0, the server keeps serving its own jobs ifx2≥G4(x1= 0); otherwise, stays idle.Fourth, whenx1= 0 andx2= 0, the server stays idle.

ProofFor the admission and routing controls in (i),from Lemma 3,S1(x1)is decreasing inx1.According to the definition ofS1(x1), the submodularity and concavity of the optimal value function ensurer1+V(x+e1)-V(x)≤0 andr2+V(x+e2)-V(x)≤ 0 wheneverx2≥S1(x1).Hence, it is optimal to reject the job; otherwise accept some job wheneverx2＜S1(x1); moreover, whenx2＜S1(x1)andx2≥S2(x1), we haveV(x+e1)-V(x+e2)+r1-r2≥ 0, suggesting that it is optimal to route the accepted job to Server 1; whenx2＜S1(x1)andx2＜S2(x1), the inequalityV(x+e1)-V(x+e2)+r1-r2＜ 0 holds, implying that it is optimal to route the accepted job to Server 2.(ii)and (iii)on the service and jockeying controls are proved analogously.□

4.System analysis

In this section, we demonstrate that the system with jockeying control performs no worse than that without jockeying.Here, the model for the system without jockeying has the same parameters, i.e., arrival rate, service rates, revenues, holding costs and service costs, as those in Section 2.

DefineUas the optimal value function for the model without jockeying in which the operatorsT1andT2should be removed of the jockeying control terms.Hence, the modified operators denoted by,, andare

and

Next, we show the optimal value functionVassociated with jockeying control is no less thanU.First, we have the following lemma:

Lemma 4TV≥ifV(x)≥U(x)for allx∈X.

ProofSupposeV(x)≥U(x)for allx∈X.And we will showT1V≥,T2V≥, andT0V≥T0U.

To verifyT1V≥Tˉ1U, we prove it in four cases.First,forx1＞ 0 andx2＞ 0, sinceV(x)≥U(x)andV(x-e1)≥U(x-e1), we have

Forx1＞ 0 andx2= 0, obviously, we have

Forx1= 0 andx2＞ 0, it is evident that

Forx1= 0 andx2= 0, simply,V(x)≥U(x).

T2V≥can be proved analogously.

ForT0V≥T0U, sinceV(x)≥U(x),V(x+e1)≥U(x+e1), andV(x+e2)≥U(x+e2), we have

The above lemma states that the inequality is preserved after the operatorsandact onVandU, respectively.Then we can prove our main result.

Proposition 2V(x)≥U(x)for allx∈X.

ProofWe arbitrarily chooseV0(x)andU0(x), for instance,V0(x)=U0(x)= 0, to makeV0(x)≥U0(x)for allx∈X.From Lemma 4,.Further, repeatedly applyingTandtoV0andU0, respectively, yieldsTnV0≥for alln.By the standard result,TnV0→Vandin a manner of point-wise convergence asn→ ∞.Hence,V(x)≥U(x)for allx∈X.□

The above result demonstrates the value of jockeying by showing that the system with jockeying control performs at least no worse than that without jockeying.Next,we show that the system performance is monotone with respect to some of its parameters.

Proposition 3V↑λ↑μ1↑μ2↓c12↓c21

ProofFirst, we showV↑λ, i.e., the optimal values are nondecreasing in the arrival rate.Let the system update arrival rateλ′ ≥λand other parameters remain unchanged from (1)in Section 2.Now, for model formulations, both systems use the same uniformized transition rateΛ′ =λ′ +μ1+μ2.Without loss of generality, rescale time to achieveα+Λ′ = 1.Then the Bellman equation for the original system is modified as follows:

where

The Bellman equation for the system with a new arrival rate is written as

where

In (5)and (6),T0,T1, andT2remain the same forms as those in Section 2.

Next, to showV′≥V, we first prove thatT′V′≥TVifV′≥V.The result is readily proved by showing

Then let(x)=V0(x)=0 for allx∈X.Thus,T′≥TV0.ApplyT′andTrepeatedly, leading toT′n≥TnV0for alln.Asn→ ∞, we haveV′≥V, that is,V↑ λ.

The results thatVis nondecreasing in service rates and nonincreasing in job jockeying costs can be proved in the same manner as the above argument.□

5.Numerical studies

The following example is used to illustrate the hedgingpoint policy.Jobs arrive according to a Poisson process with the rateλ= 2.The service rates of two servers are set toμ1= 2 andμ2= 2, which assuresλ＜μ1+μ2.The revenues from the accepted jobs arer1= 7 for Server 1 andr2= 7 for Server 2.The unit holding cost rates are assumed to beh1= 1 andh2= 1.The service costs at two respective servers arec1= 2 andc2= 2.The jockeying costs arec12= 3 andc21= 3, respectively.

For Bellman equations, typically, we can apply the standard methods of value iteration, policy iteration and linear programming.However, the value iteration algorithm is in general the best computational method for solving large-scale Markov decision problems [41].Here,to solve (1), we employ the value iteration algorithm, i.e.,the dynamic programming approach, which reads

That is, if we apply the operatorTin (2)to any bounded functionV0repeatedly, we can finally attain the optimal value functionV.For our example, for simplicity,letV0(x)= 0 for allx∈X.The continuous-time discount rateαis usually a very small value.However, to achieve a fast convergence of the value iteration algorithm for our model, we letα= 0.1.

For the large state space of, we need to truncate the state space in computation.If the target optimal decisions associated with the discrete states [δ1,δ2] ×[δ3,δ4], where the integers δ2＞ δ1≥ 0 and δ4＞ δ3≥ 0,are desired, we can applynnumber of value iterations from an initial truncated state space [ (δ1-n)+,δ2+n] ×[(δ3-n)+,δ4+n]to achieve our goal since each iteration causes a state transition which could be a service completion at Server 1, i.e., ( δ1-1), a service completion at Server 2, i.e., ( δ3-1), a job routed to Server 1, i.e.,(δ2+1), and a job routed to Server 2, i.e., ( δ4+1).

The uniformization rate isΛ=λ+μ1+μ2= 6 and henceα+Λ= 6.1.From (2),

Applying the above operatorTtoV0(x)= 0 repeatedly with 500 times, we attain (T500V0)(x)forx∈ [ 0,15] ×[0,15], which is our target state.Then compare two optimal values at the origin point, i.e., (T500V0)(0,0)= 22.698 7 and (T499V0)(0,0)= 22.698 4 and find that ((T500V0)(0,0)-(T499V0)(0,0))/(T499V0)(0,0)= 0.001 5% which is sufficiently small.Hence, (T500V0)(x)is a suitable approximation toV(x).Correspondingly, the derived optimal decisions are computed and listed in Fig.4, Fig.5, and Fig.6,respectively.

Fig.4 Decisions for admission and routing controls

Fig.5 Decisions for service and jockeying controls at Server 1

Fig.6 Decisions for service and jockeying controls at Server 2

In Fig.4, “0”, “1”, and “2” denote the decisions of rejecting the job, routing a job to Server 1, and routing a job to Server 2, respectively.Whenever a job arrives, if the statexis found to be, for instance, at (3, 5), i.e., three jobs in Queue 1 and five jobs in Queue 2, the corresponding decision in Fig.4 is “1 ”, which suggests that we should accept the job and route it to Server 1.

The decisions of service and jockeying controls are listed in Fig.5 and Fig.6 where “0”, “1”, and “2” indicate stopping service, serving own jobs, and serving a jockeying job, respectively.

The structural properties (i)and (iii)ofV(x)are illus-trated in Fig.7 and Fig.8, respectively.For instance, in Table 1, the differenceV(x+e1)-V(x)is monotone nonincreasing inx1andx2, respectively.The similar interpretation applies to Fig.8.The diagram of property (ii)is analogous to that of property (i)and hence it is not shown here.

Fig.7 Monotone property of the value difference V(x +e1)- V(x)

Fig.8 Mmonotone property of the value difference V(x +e1)-V(x +e2)

Furthermore, we investigate the system performance associated with various system parameters.Comparisons of the optimal values at the origin are made under different scenarios and the corresponding results are presented in Table 1.

Table 1 System performance with different paremeters

From Table 1, we can identify that the system performance of no jockeying worsens by 27.87% in contrast with the case of jockeying.For Cases 1-5, we only change the jockeying costs while other parameters are fixed.It is obvious that the optimal values are decreasing with respect to the jockeying costs.For Cases 6-10, the arrival rates are different while other parameters remain unchanged.The results show that the optimal value is increasing in the arrival rate.This is intuitive because more incoming jobs could bring in more revenues.For Cases 11-15, we examine the system performance with varied service rates.Clearly, the performance improves as the service rate increases.

6.Conclusions

In this paper, we study the optimal control of queueing systems with a Poisson arrival of jobs and two parallel exponential servers.With the structure properties of the optimal value function, we characterize the optimal control policy as a hedging point policy of which two monotone switching curves intersected at the hedging point and typically partitioned the state space into three decisionzones.The simple structure of the optimal policy is convenient for implementing the control actions in practice.

One important contribution of this paper is the incorporation of jockeying controls into queueing systems in addition to the commonly used job admission, routing,and service controls.This queueing control system has not been addressed by previous researchers.Furthermore,our numerical studies confirm the structural properties of the optimal value function and the structure of the hedging point policy and demonstrate the value of jockeying control for our queueing system.Besides, we compare system performances associated with varied system parameters and verify some monotonicity results which might offer insights into the design and operation of such queueing systems.

Although we give an analysis on the queueing system of two parallel servers, it would be interesting to study jockeying control problems with more than two servers.

Journal of Systems Engineering and Electronics2022年1期

Journal of Systems Engineering and Electronics的其它文章: Deep neural network based classification of rolling element bearings and health degradation through comprehensive vibration signal analysis; System reliability evaluation method considering physical dependency with FMT and BDD analytical algorithm; Sliding mode control of three-phase AC/DC converters using exponential rate reaching law; On-line trajectory generation of midcourse cooperative guidance for multiple interceptors; Robust adaptive control of hypersonic vehicle considering inlet unstart; Civil aircraft fault tolerant attitude tracking based on extended state observers and nonlinear dynamic inversion