OPERAT ONS RESEARCH inform ® Vol. 56 No. 5 September–October 2008 pp. 1184–1199 issn 0030-364X · eissn 1526-5463 · 08 · 5605 · 1184 doi 10.1287/opre.1080.0580 © 2008 INFORMS INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. Approxi ation Algorith s for Capacitated Stochastic Inventory Control Models Retsef Levi Sloan School of Management Massachusetts Institute of Technology Cambridge Massachusetts 02139 retsef@mit.edu Robin O. Roundy Mission Church of Jesus Christ of Latter Day Saints Apartado Aereo Barranquilla Colombia robin@orie.cornell.edu David B. Sh oys School of Operations Research and Information Engineering and Department of Computer Science Cornell University Ithaca New York 14853 shmoys@cs.cornell.edu Van Anh Truong Fixed Income Global Modelling and Analytics Group Credit Suisse Securities New York New York 10010 van.anh.truong@credit-suisse.com We develop the first algorithmic approach to compute provably good ordering policies for a multiperiod capacitated stochastic inventory system facing stochastic nonstationary and correlated demands that evolve over time. Our approach is computationally efficient and guaranteed to produce a policy with total expected cost no more than twice that of an optimal policy. As part of our computational approach we propose a novel scheme to account for backlogging costs in a capacitated multiperiod environment. Our cost-accounting scheme called the forced arginal backlogging cost-accounting sche e is significantly different from the period-by-period accounting approach to backlogging costs used in dynamic programming; it captures the long-term impact of a decision on system performance in the presence of capacity constraints. In the likely event that the per-unit order costs are large compared to the holding and backlogging costs a transformation of cost parameters yields a significantly improved guarantee. We also introduce new semimyopic policies based on our new cost-accounting scheme to derive bounds on the optimal base-stock levels. We show that these bounds can be used to effectively improve any policy. Finally empirical evidence is presented that indicates that the typical performance of this approach is significantly stronger than these worst-case guarantees. Subject classifications: stochastic inventory control; heuristics; approximation algorithms. Area of review: Manufacturing Service and Supply Chain Operations. History: Received August 2005; revisions received July 2006 December 2006 May 2007 September 2007 October 2007; accepted October 2007. 1. Intr ducti n The periodic-review capacitated inventory control problem for systems facing stochastic nonstationary (time-dependent) demands that are correlated and evolve over time is an important classical problem that is widely recognized to be computationally challenging. We develop a new algorithmic approach to compute the order quantity for such a system. We build on the work of Levi et al. (2007) who used a marginal holding cost accounting scheme and cost-balancing techniques to derive the first policies with worst-case performance guarantees for uncapacitated models. In this paper we introduce a novel marginal backlogging cost-accounting scheme that in combination with their techniques leads to analogous results for the much- harder capacitated model. We believe that our new cost- accounting scheme will have applications in many other settings. Our algorithm is guaranteed to compute a solution of total expected cost of no more than twice that of an optimal policy for any instance of the problem. The algorithm is computationally efficient and implementable without having to exhaustively enumerate future scenarios and corresponding future decisions. In particular the decision made in the current period is unaffected by any future decision. Thus it can be implemented efficiently even in the presence of complex demand structures. Specifically we consider single-item models with one location and a finite planning horizon of T discrete time periods. The demands over the T periods are random variables that can be nonstationary and correlated. The costs consist of a per-unit time-dependent ordering cost a per- unit holding cost for carrying excess inventory from period to period and a per-unit backlogging cost which is a penalty incurred in each period for each unit of unsatisfied demand (where all shortages are fully backlogged). There is a time-dependent capacity constraint on the number of 1184 INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. units ordered in each period and a lead time between the time that an order is placed and the time that it actually arrives. The capacity constraints and lead times may be stochastic. Capacitated problems are inherently more difficult computationally compared to their uncapacitated counterparts. The constraint on capacity makes future costs heavily dependent on current decisions. Myopic policies which do not consider the impact of a decision made in the current period on the costs incurred in future periods seem to perform well for some scenarios in uncapacitated systems and are even optimal in some settings (see Veinott 1965 Ignall and Veinott 1969 Iida and Zipkin 2006 Lu et al. 2006). However when applied to capacitated problems they usually perform very poorly because they do not consider possible capacity limitations in future periods. In this work we introduce a look-ahead backlogging cost-accounting scheme called the forced arginal backlogging cost-accounting sche e to capture the long-term impact of current decisions on future costs in the presence of capacity constraints. Our new cost-accounting scheme assigns to the decision in each period all of the expected backorder costs that once this decision is made become inevitable; that is they are unaffected by any decision made in future periods and are dependent only on future demands. The forced marginal backlogging cost reduces to the traditional backlogging cost when the capacity is infinite; thus it is a generalization of the traditional backlogging cost. Finally as discussed in §3.1 it is straightforward to compute in most common scenarios. The key feature distinguishing the algorithms presented in this paper from those previously studied for capacitated systems is the treatment of correlation in demand across time as well as nonstationarity. Moreover we allow observations of the past to change demand forecasts for the future. Our model also captures other important characteristics of a nonstationary environment: the parameters are fully time dependent including cost parameters and system capacity. An important application of demand correlation and nonstationarity is in the use of dynamic demand forecasts. These forecasts and the way they evolve over time provide vital information that can be used to find effective inventory control policies in dynamic and highly volatile demand environments. The assumptions that we make on the demand distributions in this work are mild enough to generalize all of the currently known approaches in the literature to model correlation and nonstationarity of demand over time. These include classical approaches such as the artingale odel of forecast evolution (MMFE) exogenous Markovian demand time series order-one autoregressive demand and random walks. For an overview of the different approaches and models and for relevant references we refer the reader to Iida and Zipkin (2006) and Dong and Lee (2003) and Özer and Wei (2004). High correlation between demands across different periods in nonstationary and dynamic environments presents a considerable challenge to computing or even approximating optimal inventory control policies. The dominant paradigm in almost all of the existing literature has been to formulate multiperiod capacitated models using dynamic programming. The optimization problem is defined recursively over time by using subproblems for each possible state of the system. The state usually consists of a given time period the level of inventory at the beginning of the period the resulting conditional distribution of future demands over the rest of the horizon and possibly more information that is available by that period. For each subproblem an optimal solution is computed to minimize the expected overall discounted cost from the current point to the end of the horizon. This framework has turned out to be very effective in characterizing the structure of the optimal policy of the overall system. Assuming stationary linear costs and independent and identically distributed (i.i.d.) demands Federgruen and Zipkin (1986a b) showed that a modified base-stock policy is optimal under infinite- horizon average cost and discounted cost criteria. They established the existence of a target inventory level such that the optimal policy aims to keep inventory levels as close as possible to that target. When the inventory level at the beginning of the period is above the target level the optimal policy does not order. When the inventory level at the beginning of the period is below the target level it might not be possible to order up to the target level because of the capacity constraint. In this case the order placed would be up to capacity. Tayur (1992) Kapuscinski and Tayur (1998) and Aviv and Federgruen (1997) derived the optimal policy in the same settings but for independent cyclical demands. More recently Özer and Wei (2004) showed the optimality of modified base-stock policies in single-item periodic-reviewed capacitated inventory control models with advance demand information. Axsäter (1990) is the first to introduce the notion of atching between pairs of demand and supply units. Specifically he observes that in a distribution system with a single depot and multiple retailers a supply unit ordered by a retailer can be used to fill a corresponding demand unit following a certain order. He matches this pair of units and evaluates the corresponding expected holding cost. Katircioglu and Atkins (1996) have used this observation to analyze the optimal policies in unit demand inventory systems. For the uncapacitated periodic-review stochastic inventory control problem Muharremoglu and Tsitsiklis (2008) have proposed an alternative approach to the dynamic programming framework. They have observed that this problem can be decoupled into a series of unit supply- de and subproble s where each subproblem corresponds to a single unit of supply and a single unit of demand that are matched. This novel approach enabled them to substantially simplify some of the dynamic-programmingbased proofs on the structure of optimal policies as well as to prove several important new structural results. In INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. particular they have established the optimality of state- dependent base-stock policies for the uncapacitated model with general Markov-modulated demand. Using this unit decomposition they have also suggested new methods to compute the optimal policies. However their computational methods are essentially dynamic programming approaches applied to the unit subproblems; hence they suffer from similar problems in the presence of correlated and non- stationary demand. Although our approach is very different from theirs we use some of their ideas as technical tools in some of the proofs. Janakiraman and Muckstadt (2003) have extended this approach to capacitated models and established the optimality of state-dependent modified base-stock policies for models with Markov-modulated demand. Unfortunately the rather simple forms of these policies do not always lead to efficient algorithms for computing the optimal policies. Complex demand structures such as the one we consider in this work cause the state space of the corresponding dynamic programs to explode (see Iida and Zipkin 2006 Dong and Lee 2003 for relevant discussions on the MMFE model). There does not exist at present nor is there likely to be developed an efficient algorithm to solve these dynamic programs to optimality even for the uncapacitated model. The difficulty comes from the fact that we need to solve “too many” subproblems a phenomenon known as the curse of di ensionality. To date computational procedures have been made tractable only under assumptions of simple demand structures. If the demands in different periods are independent the corresponding dynamic programs are relatively straightforward to solve. Dynamic programming can still be tractable for uncapacitated models with Markov-modulated demand but under rather strong assumptions on the structure and the size of the state space of the underlying Markov process (see for example Song and Zipkin 1993 Chen and Song 2001). Tayur (1992) uses the shortfall distribution and the theory of storage processes to derive an efficient computational method for computing the optimal policy in the stationary cost i.i.d. demand average cost case. Roundy and Muckstadt (2000) showed how to obtain approximate base- stock levels also for the stationary cost and i.i.d. demand case by approximating the distribution of the shortfall process. Kapuscinski and Tayur (1998) proposed a simulation- based technique using infinitesimal perturbation analysis to compute the optimal policy for capacitated problems with independent cyclical demands. Finally Özer and Wei (2004) developed an exact dynamic programming approach for the model with advance demand information when the forecast horizon exceeds the lead time by two periods and so the state space is only two dimensional. There have been heuristic approaches to compute order quantities for capacitated problems. However we are aware of very few attempts to analyze the worst-case performance of heuristics and most bounds derived are dependent on the particular input (see for example Lu et al. 2006). To the best of our knowledge there are no other policies for stochastic inventory control models with constant worst-case performance guarantees. Metters (1997) found heuristics for capacitated lost-sales problems with independent cyclical demands. Chan (1999) have considered heuristics for uncapacitated and capacitated multi-item models. Instead of solving the one-period problem (as in the myopic policy) they have added a penalty function to the one-period problem which they called the Q-function. This function accounts for the holding cost incurred by the inventory left at the end of the period over the entire horizon. Their look-ahead approach with respect to the holding cost is somewhat related to our approach although significantly different. As we have already mentioned this paper builds on the work of Levi et al. (2007). They give the first algorithms with a constant performance guarantee for the uncapacitated stochastic inventory control model with correlated nonstationary demands; specifically their algorithms always find solutions of total expected cost no more than twice the optimal. Their algorithms are based on two main ideas. First they construct a look-ahead holding-cost accounting scheme called the arginal holding-cost accounting sche e to compute the additional holding costs incurred by units ordered in the current period throughout the entire horizon. Second they use cost-balancing techniques in that in each period they order exactly to balance the following two opposing costs: the conditional expected marginal holding cost against the conditional expected period backlogging cost a lead time ahead. Their approach relies heavily on the ability of the system to order in each period a “balancing quantity” that equalizes the expected marginal holding cost and the expected backlogging cost in the period. In capacitated systems the approach fails because this balancing quantity might not be attainable due to capacity constraints. Our forced marginal backlogging cost- accounting scheme is designed to remedy this problem by reassigning backlogging costs more appropriately to the decisions that create them enabling us to find a “balancing order quantity” for capacitated systems. Suppose that in the current period the order placed was not up to capacity; we wish to account for the potential backlogging cost in future periods incurred directly by the decision not to use the full available capacity. Assume temporarily that we order up to capacity in each one of the periods. Suppose now that in the current period we do not order up to capacity. Then the expected marginal backlogging cost associated with the current period is the overall increase in the expected backlogging cost over the entire horizon resulting from this decision. In this way our balancing policy for a capacitated system is able to achieve the same worst-case performance guarantee of two with surprisingly little additional computational effort. When applied to uncapacitated models the policies described in this paper are identical to the dual-balancing policies described by INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. Levi et al. (2007). Thus they can be viewed as generalizations of the original dual-balancing policies to capacitated inventory models. We also use the marginal holding and forced marginal backlogging cost-accounting schemes to derive additional semimyopic policies called the lower- yopic and upper- yopic policies. The policies provide lower and upper bounds on the optimal base-stock levels respectively that can be used in conjunction with any policy to achieve lower expected cost. Furthermore in §4.2 we show how to use standard cost transformations to improve the performance of the algorithms in many important settings (see also Levi et al. 2007). These transformations yield a modified instance of the problem that is equivalent to the original one from an optimization perspective but models only holding and backlogging costs. If the per-unit ordering cost is constant over time then applying our algorithms to the modified instance yields an approximation algorithm with a worst- case guarantee of two with respect to the holding and backlogging costs and which has the same total per-unit ordering cost as the optimal policy. More generally when the ordering costs are large the worst-case performance guarantee of the modified cost dual-balancing policy will be much better than two. In §6 we test the typical empirical performance of the balancing algorithms in two settings. We consider an inventory system that has i.i.d. demand (no correlations) and a demand distribution with an exponential tail because the optimal policy can be computed analytically. (The motivation is to test the balancing policies at least in one environment in which the optimal policy and cost are known.) However balancing policies are most attractive in scenarios with complex demand structures whereas optimal policies cannot be computed and no provable good heuristics or reasonable lower bounds are known. Thus we also consider the same set of test scenarios tested in Hurley et al. (2006) in which the uncapacitated versions of these algorithms were evaluated computationally. In these scenarios the demands and forecasts evolve according to the multiplicative MMFE model. Optimal policies are not computable and strong lower bounds on the optimal cost do not exist so we used the myopic policy as a benchmark for evaluating performance. The performance of the balancing policies is very robust. It was within 11% of optimal on average in the first test (always within 25%) and consistently improved upon myopic by over 27% on average and by over 50% in many scenarios. This paper is organized as follows. In §2 we present the mathematical formulation of the periodic-review capacitated stochastic inventory control problem. In §3 we describe the forced marginal backlogging cost-accounting scheme for the capacitated model. In §4 we describe the balancing policy and its worst-case analysis. We also extend the approximation results to the case of discrete demand and stochastic lead times (see Appendix C in the online appendices which are available as part of the online version that can be found at http://or.pubs.informs.org/.). In §5 we develop lower and upper bounds on the optimal inventory levels and show how to use them to improve any policy. Section 6 contains our computational results. Online Appendix A contains a very simple illustrative example for the case of integer-valued demand. In Online Appendix B we present a detailed description of the scenarios tested in the computational results. 2. Capacitated Peri dic-Review St chastic Invent ry C ntr l Pr blem In this section we provide the mathematical formulation of the capacitated periodic-review stochastic inventory problem and introduce some of the notation used throughout the paper. As a general convention we distinguish between a random variable and its realization using capital letters and lower-case letters respectively. Script font is used to denote sets. We consider a finite planning horizon of T periods numbered t = 1 T (note that t and T are both deterministic unlike the convention above). The demands over these periods are random variables denoted by D1 DT . As part of the model we assume that at the beginning of each period s we are given what we call an inforation set that is denoted by s . The information set s contains all of the information that is available at the beginning of time period s. More specifically the information set s consists of the realized demands (d1 ds−1) over the interval [1 s) and possibly some more (external) information denoted by (w1 ws ). The information set s in period s is one specific realization in the set of all possible realizations of the random vector Fs = D1 Ds−1 W1 Ws . This set is denoted by s . In addition we assume that in each period s there is a known conditional joint distribution of the future demands (D ) denoted by I = I which is determined sDT sss by (i.e. knowing we also know I ). For ease ss ss of notation Dt will always denote the random demand in period t according to the conditional joint distribution Is for some s t where it will be clear from the context to which period s we refer. We will use t as the general index for time and s will always refer to the current period. The only assumption on the demands is that for each s = 1 T and each ∈ the conditional expectation E D · is ss ts well defined and finite for each period t s. In particular we allow nonstationarity and correlation between the demands of different periods. In the periodic-review stochastic inventory control problem our goal is to supply each unit of demand while attempting to avoid ordering it either too early or too late. In period t (t = 1 T ) three types of costs are incurred a per-unit ordering cost ct for ordering up to ut units where ut 0 is the available order capacity in period t; a per-unit holding cost ht for holding excess inventory from period t to t + 1; and a per-unit backlogging penalty pt INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. that is incurred for each unsatisfied unit of demand at the end of period t. Unsatisfied units of demand are usually called backorders. Backorders accumulate fully over time until they are satisfied. That is each unit of unsatisfied demand will stay in the system and will incur a backlogging penalty in each period until it is satisfied. In addition there is a lead time of L periods between the time an order is placed and the time at which it actually arrives. We first assume that the lead time is a known integer L. In Online Appendix C we show that our policy can be modified to handle stochastic lead times under the assumption of no order crossing (i.e. any order arrives no later than those placed later in time). In §4.1 we show that extensions to the case of random capacities are straightforward. There is also a discount factor 1. The cost incurred in period t is discounted by a factor of t . Because the horizon is finite and the cost parameters are time-dependent we can assume without loss of generality that = 1. We also assume that there is no speculative motivation for holding inventory or having backorders in the system. To enforce this we assume that for each t = 2 T − L the inequalities ct ct−1 + ht+L−1 and c ct+1 + p are maintained tt+L (where cT +1 = 0). (If there is a discount factor we require that c ct−1 + Lh and c ct+1 + Lp .) We tt+L−1 tt+L also assume that the parameters h p and c are all tt t nonnegative. Note that the parameters hT and pT can be defined to take care of excess inventory and backorders at the end of the planning horizon. In particular pT can be set sufficiently high so as to ensure that there are very few backorders at the end of period T . The goal is to find a feasible ordering policy (i.e. one that respects the capacity constraints) that minimizes the overall expected discounted ordering cost holding cost and backlogging cost. We consider only policies that are nonanticipatory i.e. at time s the information that a feasible policy can use consists only of s and the current inventory level. Throughout the paper we will use Dst to denote the accumulated demand over the interval [st] i.e. Dst = t j=s Dj . We will also use superscripts P and OPT to refer to a given policy P and an optimal policy respectively. Given a feasible policy P we describe the dynamics of the system using the following terminology. We let NIt denote the net inventory at the end of period t which can be either positive (in the presence of physical on-hand inventory) or negative (in the presence of backorders). Because we consider a lead time of L periods we also consider the orders that are on the way. The sum of the units included in these orders added to the current net inventory is referred to as the inventory position of the system. We let Xt be the inventory position at the beginning of period t before the t−1 order in period t is placed i.e. Xt = NIt−1 + j=t−L Qj (for t = 1 T ) where Qj denotes the number of units ordered in period j (we will sometimes denote tj − = 1 t−L Qj by Qt−Lt−1 ). Similarly we let Yt be the inventory position after the order in period t is placed i.e. Yt = Xt + Qt . Note that once we know the policy P and the information set ∈ we can easily compute nis−1 x and y where ss ss again these are the realizations of NIs−1 Xs and Ys respectively. 3. Marginal C st Acc unting Scheme In this section we present a arginal cost accounting sche e for stochastic inventory control problems with capacity constraints on the size of the order in each period. This extends and generalizes the marginal cost accounting discussed by Levi et al. (2007). Because this cost-accounting approach is central for our approximation results we explain it in detail repeating some of the ideas of that paper. Our approach differs significantly from the traditional cost-accounting approaches which are based on standard dynamic programming. We start by reviewing their cost-accounting approach which is called arginal cost accounting. The main idea underlying this approach is to account for all the expected costs associated with the decision of how many units to order in period t when this decision is made. More specifically the decision in period t is associated with all the expected costs that after that decision is made become unaffected by any future decision and are only dependent on future demands. In Levi et al. (2007) it was shown that in uncapacitated models these costs are relatively easy to compute already in period t even though they may include costs that are going to be incurred only in future periods. Taking this approach Levi et al. have proposed a marginal holding-cost accounting scheme. Their approach is based on the convention that units in inventory are consumed on a first-ordered-first-consumed basis. This implies that the overall holding cost of the qs units ordered in period s (i.e. the holding cost they incur over the entire horizon sT ) is a function only of future demands and is independent of any future decision. Based on the assumption that inventory is consumed on a first-ordered-first-consumed basis the qs units on order will be used to satisfy demand only when the xs units presently in the system have been completely consumed. Among these qs units the number of those still remaining in inventory at the end of period j (where j s + L) is precisely qs − Ds j − xs ++ . Each of these units incurs a cost of hj . More specifically conditioning on an information set s ∈ s the marginal holding cost is defined to be (assuming again T that = 1) hj q − D − x ++ . Observe again j=s+L s sjs that for each nonanticipatory policy P if conditioned on some t ∈ t the inventory position at the beginning of period t denoted by xtP is known deterministically. In addition once the order in period s is determined the backlogging cost a lead time ahead in period s + L i.e. p D − x + q + is also dependent only on the s+L ss+L ss future demands. This leads to a marginal cost accounting. For each feasible policy P let Ht P be the ordering and holding cost incurred over the interval tT by the Qt P INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. units ordered in period t (for t = 1 T ) andlet P t be the backlogging cost incurred a lead time ahead in period t + L (t = 1 − LT − L). That is Ht P = ctQt P + T + P j=t+L hj Qt P − Dt j − Xt P + and t = pt+L Dt t+L − XP + QP + (where Dj = dj with probability one and t+Lt QP is given as an input for each j 0). Let P be j = qj the cost of the policy P. Clearly 0 T −L P HPP P =+ H −+ + (1) tT tt t=1−Lt=1 where H − T denotes the total expected holding cost incurred over the interval 1 T by units ordered before 0 period 1. We note that the first two expressions P t=1−Lt and H − T are not affected by our decisions (i.e. they are the same for any feasible policy and each realization of the demands). Note that without loss of generality we can assume that Qt P = Ht P = 0 for any policy P and each period t = T −L+1 T because nothing that is ordered in these periods can be used within the given planning horizon. In models with no capacity constraints there is a fundamental difference between holding cost and backlogging cost. In particular any mistake of ordering “too little” can be fixed in the next period to avoid further backlogging cost. In particular the decision of how many units to order affects the backlogging cost in a single period. However the effect of this decision if we have ordered “too much ” may last for a number of periods depending on the realized future demands. That is no future decision can fix this mistake because we cannot order a negative quantity. Consequently P t only accounts for costs incurred in a single period namely backlogging cost in period t + L and Ht P accounts for holding costs incurred over multiple periods. By way of contrast in models with capacity constraints on the size of the order in each period the above observation is no longer valid. More specifically because of the capacity constraints it is no longer true that a mistake of ordering “too little” in the current period can always be fixed by decisions made in future periods. 3.1. Marginal Backl gging C st Acc unting We now present a new backlogging cost accounting that associates with the decision of how many units to order in period s what we shall call forced backlogging cost resulting from this decision in future periods. Consider some period s. Suppose that xs is the inventory position at the beginning of period s and that the number of units ordered in the period is qs 0. s ss Focus now on some future period t s + L when this order arrives and becomes available. Suppose that for some realization of the demands we have that d − x + q + st ss j∈ st−Luj > 0. This implies that there exists a shortage in period t and moreover even if in every period after period s and until period t − L the orders had been up to the maximum available capacity this part of the shortage in period t would still exist and incur the corresponding backlogging cost. The actual shortage may be even big ger and equal to ds t − xs + qs + j∈ st−Lqj > 0 (recall that qj uj for each period j). In other words given our decision in period s this part of the shortage could not be avoided by any decision made over the interval st − L (clearly any order placed after period t − L will not be available by time t). We conclude that if more units had been ordered in period s then at least some of the shortage in period t could have been avoided. More precisely the maximum number of units of shortage that could have been avoided by ordering more units in period s is equal to min q s dst − xs + qs + j∈ st−L uj + . The intuition is that by ordering more units in period s we could have averted part of the shortage in period t but clearly not more than the unused slack capacity q s because we could not have ordered in period s more than additional q s units. In this case we would say that this part of the backlogging cost in period t was forced by the decision in period s and hence period s is associated with a backlogging penalty + of pt min q s dst − xs + qs + j∈ st−L uj . This is significantly different from the traditional backlogging cost accounting in which this cost would be associated with period t − L. We let Wst be the shortage in period t that is forced by the decision in period s (where again s t − L) i.e. + Wst = min Qs Dst − Xs + Qs + uj j∈ st−L An alternative way to express Wst using min a b += b +− + b − a for a ∈ R+ and b ∈ R is + W = D − X + Q + uj st stss j∈ st−L + − Ds t − Xs + uj (2) j∈ st−L Now using the equalities NIt = Xs +Qs + j∈ st−L Qj − Ds t (for each s t − L) and uj = Qj + Qj (for each j = st − L) we conclude that Equation (2) can be written as + + Dt −NIt − Qj − Dt −NIt − Qj (3) j∈ st−Lj∈ st−L To see why (2) (and hence (3)) holds observe that + Ds t − Xs + Qs + j∈ st−L uj >Qs if and only if D − X ++ > 0. Next we describe sev st sj∈ st−L uj eral properties of the parameters Wst. Clearly if Qs = 0 (i.e. Q = u ) then W = 0 for each t s + L. It is also ss st readily verified from (3) that if Wst > 0 for some s t − L then we have Wjt = Qj for each j ∈ st − L . For each s = 1 − LT − L let P be the overall Figure 2. Forced marginal backorder cost accounting. s forced backlogging cost in periods s + LT associated P TWP 100 with period s i.e. = (we again assume st=s+L pt st 90 Capacity qs Forced marginal backlogging cost at s Cumulative demand Cumulative supply that Dj = dj with probability one for each j 0). Let 80 u−L = q−L = 0 and q −L = and also define for each t = 1 T + Demand/supply 70 60 50 40 30 20 W−Lt = D 1−Lt − x1−L + INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. uj j∈ 1−Lt−L + = Dt − NIt − Qj j∈ 1−Lt−L T and P −L =−L = t=1 ptW−Lt. The last definition of −L is meant to account for the forced backlogging cost which is independent of any decision and is forced by the demands on any feasible policy. It is now readily verified that for each t = 1 T and for each pol P + t−L icyP wehave t−L = pt Dt − NIP t = pt j=−L Wjt P (the sum t−L is telescopic). This implies the following j=−L Wjt theorem. heorem 1. Let P be a n nanticipat ry p licy. Then, the 0 c st f p licy P can be expressed as P = P + t=−Lt T −L + HP + P . H − Tt=1 tt Note that the first two terms of P in Theorem 1 0 P and H − T are independent of any decision we t=−Lt make and are common to all feasible policies. Recall that 0 P represents the forced backlogging penalty that t=−Lt is forced on any feasible policy. Because these two terms are also nonnegative we omit them from the analysis. This does not impact our approximation results. From now on we will write the cost of a feasible policy P as P = T −L HPP + . In Online Appendix A we provide an t=1 tt illustrative example of our new cost-accounting approach. The intuition is that once a shortage is incurred in period t it is allocated to past periods s t − L in which the orders were below the available capacity. More specifically the shortage and the resulting backlogging cost in period t are charged to periods s t − L with positive unused slack capacity going backward in time from Figure 1. Period-by-period backorder cost accounting. 100 90 10 0 … Time 1 s–3 s–2 s–1 s period t − L. Each period s t − L can be charged with a part of the backlogging cost in period t for up to q s units the unused slack capacity in period s. Figures 1 2 and 3 graphically illustrate the difference between classical period-by-period accounting and forced marginal accounting for backlogging costs. All three figures reflect a single sample path of demands and orders. The total backlogging cost over the horizon is the area above the cumulative supply curve (thick line) and below the cumulative demand curve (thin line). Classical period-by-period accounting assigns to period s the difference between the curves at s (see Figure 1). Forced marginal accounting of backlogging costs assigns to period s all of the backlogging costs that were “forced ” or made inevitable because we did not order to capacity in period s. This corresponds to the area inside of the trapezoid shown in Figure 2. This trapezoid is created by extending the cumulative supply curve starting at s− 1 and at s to the right at a slope equal to the capacity of the system. These lines represent what the supply curves would look like if our policy consistently ordered at full capacity from s − 1 and s onwards respectively. In fact consider the thick short bars in the trapezoid in Figure 2. The first and second terms of (2) are the vertical coordinates of the end points of these bars. Conse- Figure 3. Allocation of a period backorder to ordering decisions. Forced marginal backlogging cost at sattributed to s–2, s–1, s 100 90 1 s–1 Ws–2 s Ws–1 s WssCapacity Cumulative demand qs Cumulative supply Capacity Cumulative demand Cumulative supply qs Classical backlogging cost at s 12 3… s–1 s 80 80 Demand/supply Demand/supply 70 60 50 40 70 60 50 40 30 30 20 20 10 10 0 0 … s–3 s–2 s Time Time INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. quently each Wst for t>s is the length of one of these bars. Figure 3 takes a different point of view. It considers the backlogging costs incurred in period s and illustrates how those costs are allocated to periods ss − 11 −L. In Levi et al. (2007) it is shown that the marginal holding cost consists of a sum of partial expectations. Once xs is known at time s the summands are expectations of simple piecewise-linear functions. If the accumulated demand Ds j (for each j s) has any of the distributions that are commonly used in inventory theory (e.g. Normal Gamma Lognormal Laplace etc.) (Zipkin 2000) then it is extremely easy to evaluate these terms. If the distribution of Ds j is discrete these functions can be computed recursively in efficient ways using the c.d.f.s. More generally the complexity of evaluating the marginal holding cost can vary depending on the level of information we assume on the demand distributions and their characteristics. In all of the common scenarios there exist straightforward methods to solve this problem efficiently (see also Hurley et al. 2006 for more details). Because in the presence of positive lead times even computing a simple myopic policy requires the same knowledge on the distribution of the accumulated demand over the lead time the computational effort involved with computing the marginal holding cost is of the same order of magnitude as for the myopic policy. Evaluating the marginal backlogging costs based on the scheme developed in this paper is analogous to the marginal holding cost. It is a sum of partial expectations of simple piecewise-linear functions and therefore is no more difficult to compute. Finally observe that for uncapacitated models with us = for each s (and hence q s = ) our backlogging cost accounting is in fact identical to the traditional backlogging accounting discussed above. This implies that the cost-accounting scheme proposed in this paper is a generalization of the one introduced in Levi et al. (2007). Therefore the preceding discussion is also a generalization of the corresponding algorithm and analysis in Levi et al. (2007). 4. Dual-Balancing P licy In this section we describe a new policy for the capacitated periodic-review stochastic inventory control problem. As in Levi et al. (2007) we call it a dual-balancing policy. We shall show that this policy has a worst-case performance guarantee of two i.e. for each instance of the problem the expected cost of the policy is at most twice the expected cost of an optimal policy. Recall the assumption discussed in §2 that the cost parameters imply no motivation for holding inventory or backorders. This implies that without loss of generality for each t = 1 T ct = 0 and ht pt 0. Moreover we first describe the algorithm its analysis and several extensions under the latter assumption. Then in §4.2 we discuss in detail the generality of this assumption. The dual-balancing policy presented in this paper is based on a balancing idea similar to the one used in Levi et al. (2007) for the uncapacitated model. That dual- balancing policy balances in each period s and conditioned on the observed information set s the expected marginal holding cost of the units ordered in the period against the expected (traditional) backlogging cost in period s + L a lead time ahead of s. However it is readily seen that this approach does not work in the case where there is a capacity constraint on the size of the order in period s. For one the order size qs that balances these two costs might not be reachable when qs >us. In turn we consider the forced marginal backlogging cost accounting and the corresponding cost it associates with period s as described in §3 above. Conditioned on the observed information set s we now balance the expected marginal holding cost of the units ordered in period s against the expected arginal backlogging costs associated with period s. We will use the superscript B to refer to the dual-balancing policy. For each period s = 1 T − L conditioning on the observed information set s let ls B qs B be the expected holding cost incurred over sT by the units ordered by the dual-balancing policy in period s. That is lB qB = E HB qB · . In §3 we have defined ss sss T HB QB ++ =− D − XB (recall that we sj=s+Lhjt sj s B BB assume c = 0). In addition let q · be ss = E ss s the expected backlogging cost associated with period s by the forced marginal backlogging cost-accounting scheme described above again conditioned on the observed information set s. Recall that in §3 we have defined s B = T WB where t=s+L pt st + WB = min QB D − XB + QB + st sst ssuj j∈ st + = D − XB + QB + uj st ss j∈ st−L + − D − XB + st suj j∈ st−L If we condition on s the inventory position at the beginning of period s xs B is known deterministically. That is B BB it is clear that lB q and q are both indeed functions ss ss of qs B the number of units ordered in period s. We first discuss the case where the orders are allowed to be fractional. This implies that the functions lBB BB q and q are continuous. In each period ss ss s = 1 T − L given the observed information set s the dual-balancing policy will order qB = q u units s ss such that the expected marginal ordering and holding cost incurred by these units over sT is equal to the expected forced marginal backlogging cost associated with period s. In other words we order q units such that lB q = s ss E HB q · = B q = EB q · . Next we show sss ss sss that this policy is well defined. It is readily verified that lBB B q is a convex increasing function of q that is equal to ss s zero for qs B = 0 and goes to as qs B goes to . Similarly INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. one can verify that s B qs B is a decreasing convex function of qs B that has a nonnegative value at qs B = 0 and that is equal to zero for qs B = us (in this case there is no unused slack capacity at s and q s B = 0). Our assumption that these functions are continuous implies that qs as defined above always exists. Computationally qs is the minimizer of the function B BBB gs q = max &lB q q which is a convex func s ss ss tion of qs B because it is the maximum of two convex functions. Hence in each period s we need to solve a convex minimization problem of a single variable. In particular if for each j s Ds j is distributed according to any of those distributions that are commonly used in inventory theory then it is extremely easy to evaluate the functions ls B qs B and s B qs B . More generally the complexity of the algorithm is of order T (i.e. number of time periods) times the complexity of solving the single variable convex minimization defined above. The complexity of this minimization problem can vary depending on the level of information we assume on the demand distributions and their characteristics. In all of the common scenarios there exist straightforward methods to solve this problem efficiently. In particular qs is determined by the intersection of two monotone convex functions which suggests that bisection methods can be effective in computing qs . We note that the dual-balancing policy is not a state-dependent base-stock policy. However it can be computed in an online anner i.e. computing the policy action in period s does not require any knowledge of the future decisions to be made in the next periods. Moreover unlike the myopic policy the dual-balancing policy does use available information about long-term future demands. 4.1. Analysis Next we show that for each instance of the problem the expected cost of the dual-balancing policy described above is at most twice the expected cost of an optimal policy. We will use the marginal cost-accounting scheme described in §3 and amortize the period cost of the dual-balancing policy with the cost of the optimal policy. Using the marginal cost-accounting scheme discussed in §3 the expected cost of the dual-balancing policy can T −L be expressed as E B = E HB + B . For each t=1 tt t = 1 T − L let Zt be the rando balanced cost by the dual-balancing policy in period t i.e. Z = E HB · . t tt Note that Zt is a function of the observed information set in period t. In the next lemma we obtain an expression for the expected cost of the dual-balancing policy using the Zt variables. The proof is identical to the proof of Lemma 4.1 in Levi et al. (2007). Lemma 1. The expected c st f the dual-balancing p licy is equal t twice the expected sum f the Zt variables, i.e., T −L = 2E Z . E Bt=1 t In the next two lemmas we show that the cost of OPT can be amortized against some of the cost of the dual- balancing policy. In particular they imply that the expected T −L cost of OPT is at least E Z . For each realization t=1 t of the demands D1 DT let H be the set of periods t = 1 T − L in which the optimal policy had inventory position higher than that of the dual-balancing policy i.e. 0 then there is nothing to prove. Assume that such a period s exists and let sl and se be the latest and the earliest periods in the set &s ∈ s t −L wst B > 0 respectively (it is possible that sl =se). We note again that here we abuse our notation and consider the set as the realized set of periods according to the specific realization T . In particular se and sl are the respective realizations of random variables Se and Sl. We have already seen (in the discussion in §3) that for each s ∈ se sl we BBB B have wst = qs and wse t ds t − xse +qse + j∈ se t−L uj . Indeed e −niOPT OPT OPT dt t =dt − y + qj −dsl t sl j∈ sl t−L dsl t − ys B l + uj j∈ sl t−L =d − y B + q B −d + uj ss j∈ ssl j∈ sl t−L slt e jesl e BB B =d − x +q + uj + q setss j ee st−Lj∈ s sl ee B B w w st st j∈ssl j∈s sl ∩ ee The first equality is again based on the fact that for each feasible policy and for each s t wehaveNIt =Ys + j∈ st−L Qj −Ds t applied to OPT and periods sl t −L. The first inequality follows from the assumption that OPT B sl ∈ and so y y and from the capacity constraints sl sl that imply qOPT uj . The second equality follows from j the fact that (for each s s ) Ys =Ys + j∈ ss Qj −Ds s applied to the dual-balancing policy and periods se sl. The last equality is achieved by adding and subtracting B j∈ sq and from the fact that uj =Qj +Qj . The proof esl j then follows. As a corollary of Lemmas 1 2 and 3 we get the following theorem. heorem 2. The dual-balancing p licy has a w rst-case perf rmance guarantee f tw , i.e., f r each instance f the capacitated peri dic-review st chastic invent ry c ntr l pr blem, the expected c st f the dual-balancing p licy is at m st twice the expected c st f an ptimal s luti n, i.e., E B 2E OPT . From Lemma 1 we know that the expected cost of the dual-balancing policy is equal to twice the expected cost of T −L the sum of the Z variables i.e. E B = E Z . tt=1 t From Lemmas 2 and 3 we know that with probability one the cost of OPT is at least as much as the holding cost incurred by units ordered by the dual-balancing policy in periods t ∈ H plus the forced marginal backlogging cost of the dual-balancing policy that is associated with periods t ∈ . In other words with probability one HOPT + OPT B t∈ H Ht B + t∈ t . Again using conditional expectations and the definition of Zt this implies that indeed E OPT HB B E t + t t∈ Ht∈ B = E Ht B · t ∈ H + t · t ∈ t = EE HB · t ∈ H + B · t ∈ · t tt t = E t ∈ H + t ∈ Zt = E Zt tt We note that if the optimal policy is deterministic (i.e. it makes deterministic decisions in each period t given the observed information set t) then if we condition on t B OPT then yt and yt are known deterministically and so are the indicators t ∈ H and t ∈ . If the optimal policy is random then the same arguments above still work. We now need to condition not only on t but also on the decisions made by the policies. Because the inventory control policy does not have any effect on the evolution of the demand the arguments above are still valid. This concludes the proof of the theorem. We note that the examples discussed in Levi et al. (2007) show that the above analysis is tight. However the analysis hints that in a typical scenario the performance would be significantly better. Hurley et al. (2006) present a thorough empirical analysis of the typical performance of dual-balancing policies in uncapacitated models. In §6 we present empirical results that confirm that this phenomenon extends to the capacitated case. Finally we note that the dual-balancing policies and the worst-case analysis can be extended to models where the capacities in each period are generated by some exogenous random process and the exact capacity available in period t is observed only at the beginning of the period. Thus the dual-balancing policies provide a worst-case guarantee of two for this important extension as well. In this case the expectations of the marginal backlogging costs are taken with respect to both the random future demands and random future capacities. In Online Appendix C we consider two extensions of the dual-balancing policy and the worst-case analysis. Specifically we discuss the extensions to models where orders must be integral and the demands are integer-valued random variables and to models with stochastic lead times under the no-order-crossing assumption. 4.2. C st Transf rmati n In this section we discuss in detail the cost transformation that enables us to assume without loss of generality that INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. for each period t= 1 T wehave ct = 0 and ht pt 0. Consider any instance of the problem with cost parameters that imply no speculative motivation for holding inventory or backorders (as discussed in §2). Following Levi et al. (2007) we use a simple transformation of the cost parameters to construct an equivalent instance with the property that for each period t= 1 T wehave ct = 0 and ht pt 0. More specifically the modified instance has the same set of optimal policies. Applying the dual-balancing policy to that instance we obtain a policy that is different from the original dual-balancing policy and which also has a performance guarantee of at most two with respect to the original problem. We shall show that this cost transformation can improve the performance guarantee of the dual-balancing policy in cases where the ordering cost is the dominant part of the overall cost. In practice this is often the case. We now describe the transformation for the case with no lead time (L= 0) and = 1; the extension to the case of arbitrary lead time is straightforward. Recall that any feasible policy P satisfies for each t= 1 T Qt = NIt− NIt−1 +Dt (for ease of notation we omit the superscript P). Using these equations we can express the ordering cost in each period tas c NI −NIt−1 +D . Now replace NI with ttt t NI+ t − NI− t its respective positive and negative parts. This leads to the following transformation of cost param ˆ eters. We let cˆ= 0 h = h + c − c c = 0 t t ttt+1 T+1 and pˆ= p − c + ct+1. Note that the assumptions on the t tt cost parameters ct ht and pt discussed in §2 and in particular the assumption that there is no speculative motivation to hold inventory or backorders imply that hˆ t and bˆ t above are nonnegative (t= 1 T). Observe that the parameters hˆ t and bˆ t will still be nonnegative even if the parameters ct ht and pt are negative and as long as the above assumption holds. Moreover this enables us to incorporate into the model a negative salvage cost at the end of the planning horizon (after the cost transformation we will have nonnegative cost parameters). It is readily verified that the induced problem is equivalent to the original one. More specifically for each realization of the demands the cost of each feasible policy P in the modified input decreases T by exactly d (compared to its cost in the original t=1 ct t input). Therefore any optimal policy for the modified input is also optimal for the original input. Now apply the dual-balancing policy to the modified problem. We have seen that the assumptions on ct ht and p ensure that hˆ and pˆ are nonnegative and hence the t tt analysis presented above is valid. Let opt and opt be the optimal expected cost of the original and modified inputs T respectively. Clearly opt = opt + E Dt . Now the t=1 ct expected cost of the dual-balancing policy in the modified input is at most 2opt. Its cost in the original input is then T T at most 2opt + E D = 2opt − E D . This t=1 ctt t=1 ct t T implies that if E t=1 ctDt is a large fraction of opt then the performance guarantee of the expected cost of the dual- balancing policy might be significantly better than two. T For example if E t=1 ctDt 0 5 opt then we can conclude that the expected cost of the dual-balancing policy is at most 1 5 opt. It is indeed the case in many real-life problems that a major fraction of the total cost is due to the ordering cost. The intuition of the above transformation is T that D is a cost that any feasible policy must pay. t=1 ct t As a result we treat it as an invariant in the cost of any policy and apply the approximation algorithm to the rest of the cost. In the case where we have a lead time L we use the equations Qt = NIt+L − NIt+L−1 + Dt+L for each t = 1 T − L to get the same cost transformation. The transformation for >1 is also straightforward. Also it is not hard to see that the cost transformation can be modified to remove say 1% of the per-unit ordering costs where 0 <1<100. This leads to a continuum of dual-balancing policies all of which are two-approximations. 5. Impr ved P licy and B unds n the Optimal Invent ry Levels In this section we consider two semimyopic (modified) base-stock policies that are easy to compute in an online manner and provide respectively lower bounds and upper bounds on the inventory levels of an optimal policy yt OPT in each period t= 1 T. We believe that these bounds can be used effectively to improve existing algorithms for computing inventory control policies for the capacitated model discussed in this paper and other capacitated stochastic inventory models. Moreover as in Hurley et al. (2006) we shall show that these policies provide bounds that are strong in the following sense: each policy that for some period t and some state t has inventory level outside the range defined by the respective lower and upper bounds can be improved. In particular there is another (modified) policy that in period tand state t admits an inventory level within the specified range with expected cost no greater than the expected cost of the original policy. In other words any policy that violates these respective bounds is dominated by another policy. We then follow Hurley et al. (2006) and construct an i proved dual-balancing policy that incorporates these bounds. This policy also has a performance guarantee of two and as the computational study for the uncapacitated model in Hurley et al. (2006) suggests we expect that it will have a better typical performance. The policies we consider are called lower- yopic (denoted by LM) and upper- yopic (denoted by UM) respectively. In the lower-myopic policy in each period s conditioning on the observed information set s we minimize the su of the expected marginal holding cost of the units ordered in that period and the traditional expected backlogging costs a lead time ahead. That is in each period s we minimize LM = lLM + gs qs s qs + E ps+L Ds s+L − xs + qs · s INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. under the constraint qs us. This is a convex function of qs. This policy was first proposed for the uncapacitated model by Levi et al. (2007) who called it the ini izing policy. They have shown that this is a base-stock policy that provides lower bounds on the optimal base-stock levels. However in the capacitated model it is possible that the actual minimizer will not be attainable. In this case we order up to capacity and this provides a modified base-stock policy. In this paper we extend and generalize their proof for the capacitated model. In the upper-myopic policy in each period s again conditioning on s we minimize the sum of the expected period holding cost and the expected forced marginal backlogging. Thus we minimize UM UM + gs qs = s qs + E hs+L xs + qs − Ds s+L · s subject to 0 qs us which is also convex in qs. We shall show that this policy provides upper bounds on the inventory levels of an optimal policy. By arguments similar to the ones used by Levi et al. (2007) it can be shown that this gives rise to yet another modified base-stock policy. UM1 UM2 1 (In particular gs q − gs q depends only on y = xs + q1 and y2 = xs + q2.) To the best of our knowledge this is a new way for deriving upper bounds on the inventory levels of an optimal policy in the capacitated model. We note that it is not clear whether the classical myopic policy where we minimize the expected period cost provides any bounds for capacitated models. Another similar open question is how the policy that in each period minimizes the sum of the expected marginal holding cost and expected forced marginal backlogging cost is related to an optimal policy. Let Yt LM and Yt UM be the respective inventory position (after orders are placed) of the lower-myopic and the upper- myopic policies in period t = 1 T. Specifically we assume that Yt LM is the smallest minimizer of the corresponding period problem being solved (see above) and that YUM t is the largest minimizer of the corresponding period problem. Note that the inventory position levels depend on the specific state ( t xt) but for ease of notation we omit the indication of the state. The two semimyopic policies described above can be implemented in an online manner i.e. regardless of the action control in future periods. We shall show that for each evolution T these two policies provide lower and upper bounds on the inventory levels of YOPT YUM any optimal policy i.e. Yt LM tt with probability one for each t= 1 T. Moreover we shall show that YUM each nondominated policy P must have YLM YP t tt for each t= 1 T. The next two lemmas show that each policy P that has for some period s and state s inventory position ys P LM UM ys ys can be strictly improved by a modified pol- P LM UM icy P with y ∈ yy and expected cost at most the s ss expected cost of P. For the sake of simplicity we consider a model with no lead time (the extensions to the case with L>0 are straightforward). Lemma 4. C nsider a feasible p licy P, and supp se that f r s me peri d sand inf rmati n set s, we have ys P ys UM . Further assume that s is the earliest such peri d. Then, the p licy P that f ll ws P until peri d s− 1, then rders up t ys UM in peri d s and again imitates P ver the interval sT , has expected c st n larger than the expected c st f P. By arguments identical to the ones in Lemma 4 we conclude that P and P incur the same cost over 1 s and that they have the same inventory position x yUM yUM) we sss s sss fix this decision by instead increasing the order up to ys LM or decreasing it down to ys UM respectively. It can be readily verified that for each evolution T and each period s we LM IB UM have y y y . s ss We next prove the following theorem. heorem 3. The impr ved dual-balancing p licy has a perf rmance guarantee f tw . Observe that in the improved dual-balancing policy it is no longer true that in each period t the expected marginal holding cost is equal to the expected forced marginal backlogging cost. Now let Zt be the maximum among the expected marginal holding cost and expected forced marginal backlogging cost i.e. Z = max E HIB QIB · t ttt IBIB IB E t Qt · t where Qt is the order quantity placed by the improved dual-balancing policy in period s. (As already mentioned Qt IB can be either larger or smaller than the balancing quantity Qt.) Similar to Lemma 1 we now conclude that E IB 2 tE Zt . Next we modify the definition of the sets H and in §4. The set H will consist of periods t= 1 T− L YLM YOPT ods such that (i) YLM and YIB ;or t tt tt YUM = t ttt YLM IB and Q YUM policy orders Q Q (i.e. XIB + Q ) and YIB . However tt ttt tt we again get a contradiction because by definition t∈ H (see cases (ii) and (iii) in the definition of H above). This concludes the proof of the lemma. 6. C mputati nal Experiments As we mentioned in the introduction due to state space explosion the corresponding inventory control models are very difficult from a computational perspective. Consequently we study the typical performance of the balancing policies in two settings. In the first setting the optimal solution of the capacitated inventory system is easily computed but there is no evolution of forecasts (i.e. demands are independent over time). This enables us to see how close to optimal the balancing policy is in at least one setting. The second experiment is more realistic in that the demand and forecast evolution processes are governed by the multiplicative MMFE model. In fact these are the settings in which balancing policies are most attractive because optimal policies are inaccessible and no provably good heuristics or even reasonable lower bounds are available. As a result we benchmark the performance of the balancing policies using the myopic and the other semimyopic policies developed in this paper in §5. In these experiments the balancing policies were very robust. For the model with independent demands the dual-balancing policy came within 11% of the optimal cost on average within 17% of optimal in 95% of the trials and never exceeded the optimal cost by more than 25%. Moreover the balancing policies outperform the myopic policy by 49% in the first experiment and by 27% in the second on average. (In many scenarios the balancing policies improve upon myopic by more than 50%.) This indicates that the typical performance of the balancing policies is significantly better than the worst-case guarantees. 6.1. Experiments with Translated-Mass Exp nential Demand Distributi ns In this experiment we consider infinite-horizon problems with i.i.d. demand i.e. the distribution of Dt · s INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. Figure 4. Sensitivity of performance to capacity backorder costs and demand variance. 1.00 1.05 1.15 1.10 3.25 2.75 2.25 1.75 1.25 1.20 1 32 1 16 1 8 1 4 1 2 1 2 4 816 32 64 Myopic/opt vs. capacity – 1 Bal/opt vs. backorder$/holding$ Bal/opt vs. capacity – 1 Myopic/opt vs. demand variance Myopic/opt vs. backorder$/holding$ Bal/opt vs. demand variance is independent of both s and t. We assume that Dt has a translated-mass exponential distribution meaning that P Dt >x = 1if xx − x−a = qe + where 0 q 1 3> 0 a 0 and a· 1 − q = 0. If q= 1 then Dt has an exponential distribution translated to the right by aunits. If q<1 then a= 0 Dt = 0 with probability 1−q and with probability q Dt follows an exponential distribution. For every positive mean and variance there is a unique translated-mass exponential distribution. For infinite-horizon problems with translated-mass exponential demand a stationary order-up-to policy is optimal. The optimal policy and its cost are easily obtained using the following observation: for translated-mass exponential demand the lower and upper bounds in Theorem 2 of Glasserman (1997) coincide. The demand Dt has mean one. We start with a base case in which Dt has variance one the capacity is 1.5 and the backorder cost per day is eight times larger than the holding cost. Figure 4 illustrates what happens when we fix two of these parameters and vary the third one. On the vertical axis we show the ratio of the cost of the balancing policy to the optimal cost and the ratio of the myopic cost to the optimal cost. Note that the scale on the vertical axis is not uniform. For the solid lines the horizontal axis displays the excess capacity (i.e. the capacity minus the mean demand or “capacity − 1”). For the dashed lines the horizontal ordinate is the ratio of the backorder cost per day to the holding cost. For the dotted lines the horizontal ordinate is the variance of the demand. Figure 5. Histogram of cost as a fraction of optimal cost. 700 600 500 400 300 200 100 0 Bal$/opt$ Myopic$/opt$ 12345 In addition we randomly generated 1 000 problem instances using a mean demand of one. The capacity the backorder-to-holding-cost ratio and the standard deviation of the demand are all randomly generated from translated beta distributions. For the capacity the distribution has minimum maximum mean and standard deviation equal to 1 05 3 3 1 61 0 32 . For the backorder-to-holding-cost ratio and the standard deviation of the demand the corresponding values are 11012600 1443 and 01 36 098 051 .Thecomputations were done using JAVA on a standard PC and computing the balancing decision in each period took 0.00015 seconds on average. Figure 5 shows histograms of the ratio of the balancing policy’s cost and the optimal cost and the ratio of the myopic policy’s cost and the optimal cost. Figure 6 is a restricted view of Figure 5 with a finer grid limited to the neighborhood around one. The ratio of the balancing policy’s cost to the optimal cost is 1.11 on average with a standard deviation of 0.049 a 95th percentile of 1.17 and a maximum of 1.58. For the myopic the corresponding ratio has mean 1.60 standard deviation 0.92 95th percentile 3.38 and maximum 8.61. This indicates that the Figure 6. Detailed histogram of cost as a fraction of optimal cost. 0 100 50 150 200 250 B l$/opt$ Myopic$/opt$ 1.00 1.05 1.10 1.15 1.20 1.25 balancing policy is very robust compared to the myopic Figure 7. Performance of four policies relative to policy. myopic under forecast evolution. 6.2. Experiments with Multiplicative-MMFE-Based 60 Demand and F recast Ev luti n Lower myopic Upper myopic Improved balancing Balancing 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 This test uses the experimental design of Hurley et al. (2006) in which the uncapacitated version of the balancing algorithm was tested. In all of our experiments the holding and backorder costs are ht = 1 and pt = 10. A horizon of length T = 40 was used and forecasts of demand evolve according to the multiplicative MMFE model (Graves et al. 1986 Heath and Jackson 1994). The mean demand per Number of scenarios 50 40 30 20 INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. period averaged over the 40 periods in the time horizon is 400 in all cases. The capacity is 460 units per period. The experimental design consists of 82 scenarios. For each scenario we tested 1 000 random problem instances. The scenarios were designed to capture a variety of settings and characteristics. Demand and forecast variability can be either high or low and lead times can be short or long. Some scenarios study different types of seasonality in the demand. Others consider product launches and product phaseouts. Some scenarios account for the fact that many forecasting systems generate accurate forecasts that extend many time periods into the future whereas other systems can only forecast accurately in the near term. In addition shifts in forecasts can demonstrate either no correlation positive correlation or negative correlation. The scenarios are described in detail in Online Appendix B and in Hurley et al. (2006). We study five policies: myopic lower myopic upper myopic dual-balancing and improved balancing. For each of the 82 scenarios constructed and for each policy we examine the average per-period cost of the policy over 1 000 runs. Note that because we consider a complex environment and relatively long horizon (T = 40) it is not possible to compute the optimal expected cost. Moreover to the best of our knowledge it is not even known how to compute reasonable lower bounds in this setting. Instead we use as our benchmark the myopic policy and the other semimyopic policies discussed in §5. The policies were computed using MATLAB on a standard PC. The average times to compute the period ordering decisions were 0.0031 0.0738 0.0412 seconds for the myopic the minimizing and balancing policies respectively. In Figure 7 we provide histograms of the ratio of the cost of each policy divided by the cost of myopic. Both dual-balancing and improved balancing outperform myopic in every one of the 82 scenarios. Relative to myopic they provide an average saving of 27.2% and 32.4% respectively. Lower myopic is very close to myopic (the ratio is usually close to one) and is sometimes worse than myopic. The trend is not unexpected because myopic often under- orders in capacitated systems and lower myopic always orders less than myopic. Upper myopic is virtually identical to improved dual-balancing which truncates the balancing 10 0 Average per period cost relative to myopic order quantities using the order-up-to levels of upper and lower myopic. In all of our computational experiments the performance of the balancing policy is both strong and consistent. Improved balancing is better than balancing. 7. Electr nic C mpani n An electronic companion to this paper is available as part of the online version that can be found at http://or.journal. informs.org/. Ackn wledgments The authors thank the associate editor and the anonymous referees for numerous comments that improved the content and exposition of this paper. This research was partially conducted while the first and the fourth authors were Ph.D. students in the School of Operations Research and Industrial Engineering at Cornell University. The research of the first author was partially supported by NSFgrants CCR9912422 CCR-0430682 and DMS-0732175 and AFOSR grant FA9550-08-1-0369. The research of the second author was partially supported by NSFgrants DMI-0075627 and DMI-0500263 and the Querétaro Campus of the Instituto Tecnológico y de Estudios Superiores de Monterrey. The research of the third author was partially supported by NSF grants CCR-9912422 CCR-0430682 DMI-0500263 and CCR-0635121. The research of the fourth author was partially supported by NSFgrant DMI-0500263. References Aviv Y. A. Federgruen. 1997. Stochastic inventory models with lim ited production capacity and periodically varying parameters. Probab. Engrg. Infor ational Sci. 11 107–135. Axsäter S. 1990. Simple solution procedures for a class of two-echelon inventory problems. Oper. Res. 38 64–69. Chan E. W. M. 1999. Markov chain models for multi-echelon supply chains. Ph.D. thesis School of Operations Research and Industrial Engineering Cornell University Ithaca NY. INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. Chen F. J. S. Song. 2001. Optimal policies for multi-echelon inventory problems with Markov-modulated demand. Oper. Res. 49(2) 226–234. Dong L. H. L. Lee. 2003. Optimal policies and approximations for a serial multiechelon inventory system with time-correlated demand. Oper. Res. 51(6) 969–980. Federgruen A. P. Zipkin. 1986a. An inventory model with limited production capacity and uncertain demands I: The average-cost criterion. Math. Oper. Res. 11(2) 193–207. Federgruen A. P. Zipkin. 1986b. An inventory model with limited production capacity and uncertain demands II: The discounted-cost criterion. Math. Oper. Res. 11(2) 208–215. Glasserman P. 1997. Bounds and asymptotics for planning critical safety stocks. Oper. Res. 45 224–257. Graves S. C. H. Meal S. Dasu Y. Qin. 1986. Two-stage production planning in a dynamic environment. S. Axsäter C. Schneeweiss E. Silver eds. Multi-Stage Production Planning and Control. Lecture Notes in Economics and Mathematical Systems. Springer-Verlag New York. Heath D. C. P. L. Jackson. 1994. Modeling the evolution of demand forecasts with application to safety stock analysis in production distribution systems. IIE Trans. 26(3) 17–30. Hurley G. P. Jackson R. Levi R. O. Roundy D. B. Shmoys. 2006. New policies for stochastic inventory control models—A theoretical and computational study. Working paper Cornell University Ithaca NY. Ignall E. A. F. Veinott. 1969. Optimality of myopic inventory policies for several substitute products. Manage ent Sci. 15 284–304. Iida T. P. Zipkin. 2006. Approximate solutions of a dynamic forecast- inventory model. Manufacturing Service Oper. Manage ent 8(4) 407–425. Janakiraman G. J. A. Muckstadt. 2003. A decomposition approach to a class of capacitated serial systems. Technical report TR1360 School of Operations Research and Industrial Engineering Cornell University Ithaca NY. Kapuscinski R. S. Tayur. 1998. A capacitated production-inventory model with periodic demand. Oper. Res. 46(6) 899–911. Katircioglu K. D. Atkins. 1996. New optimal policies for a unit demand inventory problem. Unpublished manuscript. Levi R. M. Pál R. O. Roundy D. B. Shmoys. 2007. Approximation algorithms for stochastic inventory control models. Math. Oper. Res. 32 284–302. Lu X. J. S. Song A. C. Regan. 2006. Inventory planning with forecast updates: Approximate solutions and cost error bounds. Oper. Res. 54 1079–1097. Metters R. 1997. Production planning with stochastic seasonal demand and capacitated production. IIE Trans. 29 1017–1029. Muharremoglu A. J. N. Tsitsiklis. 2008. A single-unit decomposition approach to multi-echelon inventory systems. Oper. Res. Forthcoming. Özalp Ö. W. Wei. 2004. Inventory control with limited capacity and advance demand information Oper. Res. 52(6) 988–1000. Roundy R. O. J. A. Muckstadt. 2000. Heuristic computation of periodic-review base stock inventory policies. Manage ent Sci. 46(1) 104–109. Song J. S. P. Zipkin. 1993. Inventory control in a fluctuating demand environment. Oper. Res. 41(2) 351–370. Tayur S. 1992. Computing the optimal policy for capacitated inventory models. Stochastic Models 9 585–598. Veinott A. F. 1965. Optimal policy for a multi-product dynamic non- stationary inventory problem. Manage ent Sci. 12 206–222. Zipkin P. H. 2000. Foundations of Inventory Manage ent. McGraw-Hill New York.