OPERAT ONS RESEARCH 

inform ® 
Vol. 56 No. 5 September–October 2008 pp. 1184–1199 
issn 0030-364X · eissn 1526-5463 · 08 · 5605 · 1184 

doi 10.1287/opre.1080.0580 
© 2008 INFORMS 

INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

Approxi ation Algorith s for Capacitated 
Stochastic Inventory Control Models 

Retsef Levi 

Sloan School of Management Massachusetts Institute of Technology Cambridge Massachusetts 02139 
retsef@mit.edu 

Robin O. Roundy 

Mission Church of Jesus Christ of Latter Day Saints Apartado Aereo Barranquilla Colombia 
robin@orie.cornell.edu 

David B. Sh oys 

School of Operations Research and Information Engineering and Department of Computer Science Cornell University 
Ithaca New York 14853 shmoys@cs.cornell.edu 

Van Anh Truong 

Fixed Income Global Modelling and Analytics Group Credit Suisse Securities 
New York New York 10010 van.anh.truong@credit-suisse.com 

We develop the first algorithmic approach to compute provably good ordering policies for a multiperiod capacitated 
stochastic inventory system facing stochastic nonstationary and correlated demands that evolve over time. Our approach 
is computationally efficient and guaranteed to produce a policy with total expected cost no more than twice that of an 
optimal policy. As part of our computational approach we propose a novel scheme to account for backlogging costs in a 
capacitated multiperiod environment. Our cost-accounting scheme called the forced arginal backlogging cost-accounting 
sche e is significantly different from the period-by-period accounting approach to backlogging costs used in dynamic 
programming; it captures the long-term impact of a decision on system performance in the presence of capacity constraints. 
In the likely event that the per-unit order costs are large compared to the holding and backlogging costs a transformation 
of cost parameters yields a significantly improved guarantee. We also introduce new semimyopic policies based on our 
new cost-accounting scheme to derive bounds on the optimal base-stock levels. We show that these bounds can be used to 
effectively improve any policy. Finally empirical evidence is presented that indicates that the typical performance of this 
approach is significantly stronger than these worst-case guarantees. 

Subject classifications: stochastic inventory control; heuristics; approximation algorithms. 
Area of review: Manufacturing Service and Supply Chain Operations. 
History: Received August 2005; revisions received July 2006 December 2006 May 2007 September 2007 

October 2007; accepted October 2007. 

1. Intr ducti n 

The periodic-review capacitated inventory control problem 
for systems facing stochastic nonstationary (time-dependent) 
demands that are correlated and evolve over time is 
an important classical problem that is widely recognized to 
be computationally challenging. We develop a new algorithmic 
approach to compute the order quantity for such 
a system. We build on the work of Levi et al. (2007) 
who used a marginal holding cost accounting scheme and 
cost-balancing techniques to derive the first policies with 
worst-case performance guarantees for uncapacitated models. 
In this paper we introduce a novel marginal backlogging 
cost-accounting scheme that in combination with 
their techniques leads to analogous results for the much-
harder capacitated model. We believe that our new cost-
accounting scheme will have applications in many other 
settings. Our algorithm is guaranteed to compute a solution 
of total expected cost of no more than twice that of an 
optimal policy for any instance of the problem. The algorithm 
is computationally efficient and implementable without 
having to exhaustively enumerate future scenarios and 
corresponding future decisions. In particular the decision 
made in the current period is unaffected by any future decision. 
Thus it can be implemented efficiently even in the 
presence of complex demand structures. 

Specifically we consider single-item models with one 
location and a finite planning horizon of T discrete time 
periods. The demands over the T periods are random variables 
that can be nonstationary and correlated. The costs 
consist of a per-unit time-dependent ordering cost a per-
unit holding cost for carrying excess inventory from period 
to period and a per-unit backlogging cost which is a 
penalty incurred in each period for each unit of unsatisfied 
demand (where all shortages are fully backlogged). There 
is a time-dependent capacity constraint on the number of 

1184 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

units ordered in each period and a lead time between the 
time that an order is placed and the time that it actually 
arrives. The capacity constraints and lead times may be 
stochastic. 

Capacitated problems are inherently more difficult computationally 
compared to their uncapacitated counterparts. 
The constraint on capacity makes future costs heavily 
dependent on current decisions. Myopic policies which do 
not consider the impact of a decision made in the current 
period on the costs incurred in future periods seem to perform 
well for some scenarios in uncapacitated systems and 
are even optimal in some settings (see Veinott 1965 Ignall 
and Veinott 1969 Iida and Zipkin 2006 Lu et al. 2006). 
However when applied to capacitated problems they usually 
perform very poorly because they do not consider possible 
capacity limitations in future periods. 

In this work we introduce a look-ahead backlogging 
cost-accounting scheme called the forced arginal backlogging 
cost-accounting sche e to capture the long-term 
impact of current decisions on future costs in the presence 
of capacity constraints. Our new cost-accounting scheme 
assigns to the decision in each period all of the expected 
backorder costs that once this decision is made become 
inevitable; that is they are unaffected by any decision 
made in future periods and are dependent only on future 
demands. The forced marginal backlogging cost reduces to 
the traditional backlogging cost when the capacity is infinite; 
thus it is a generalization of the traditional backlogging 
cost. Finally as discussed in §3.1 it is straightforward 
to compute in most common scenarios. 

The key feature distinguishing the algorithms presented 
in this paper from those previously studied for capacitated 
systems is the treatment of correlation in demand across 
time as well as nonstationarity. Moreover we allow observations 
of the past to change demand forecasts for the 
future. Our model also captures other important characteristics 
of a nonstationary environment: the parameters are 
fully time dependent including cost parameters and system 
capacity. An important application of demand correlation 
and nonstationarity is in the use of dynamic demand 
forecasts. These forecasts and the way they evolve over 
time provide vital information that can be used to find 
effective inventory control policies in dynamic and highly 
volatile demand environments. The assumptions that we 
make on the demand distributions in this work are mild 
enough to generalize all of the currently known approaches 
in the literature to model correlation and nonstationarity of 
demand over time. These include classical approaches such 
as the artingale odel of forecast evolution (MMFE) 
exogenous Markovian demand time series order-one autoregressive 
demand and random walks. For an overview of 
the different approaches and models and for relevant references 
we refer the reader to Iida and Zipkin (2006) and 
Dong and Lee (2003) and Özer and Wei (2004). 

High correlation between demands across different periods 
in nonstationary and dynamic environments presents 
a considerable challenge to computing or even approximating 
optimal inventory control policies. The dominant 
paradigm in almost all of the existing literature has been 
to formulate multiperiod capacitated models using dynamic 
programming. The optimization problem is defined recursively 
over time by using subproblems for each possible 
state of the system. The state usually consists of a given 
time period the level of inventory at the beginning of 
the period the resulting conditional distribution of future 
demands over the rest of the horizon and possibly more 
information that is available by that period. For each subproblem 
an optimal solution is computed to minimize the 
expected overall discounted cost from the current point to 
the end of the horizon. This framework has turned out 
to be very effective in characterizing the structure of the 
optimal policy of the overall system. Assuming stationary 
linear costs and independent and identically distributed 
(i.i.d.) demands Federgruen and Zipkin (1986a b) showed 
that a modified base-stock policy is optimal under infinite-
horizon average cost and discounted cost criteria. They 
established the existence of a target inventory level such 
that the optimal policy aims to keep inventory levels as 
close as possible to that target. When the inventory level 
at the beginning of the period is above the target level the 
optimal policy does not order. When the inventory level 
at the beginning of the period is below the target level 
it might not be possible to order up to the target level 
because of the capacity constraint. In this case the order 
placed would be up to capacity. Tayur (1992) Kapuscinski 
and Tayur (1998) and Aviv and Federgruen (1997) derived 
the optimal policy in the same settings but for independent 
cyclical demands. More recently Özer and Wei (2004) 
showed the optimality of modified base-stock policies in 
single-item periodic-reviewed capacitated inventory control 
models with advance demand information. 

Axsäter (1990) is the first to introduce the notion of 
atching between pairs of demand and supply units. 
Specifically he observes that in a distribution system with 
a single depot and multiple retailers a supply unit ordered 
by a retailer can be used to fill a corresponding demand 
unit following a certain order. He matches this pair of units 
and evaluates the corresponding expected holding cost. 
Katircioglu and Atkins (1996) have used this observation 
to analyze the optimal policies in unit demand inventory 
systems. For the uncapacitated periodic-review stochastic 
inventory control problem Muharremoglu and Tsitsiklis 
(2008) have proposed an alternative approach to the 
dynamic programming framework. They have observed that 
this problem can be decoupled into a series of unit supply-
de and subproble s where each subproblem corresponds 
to a single unit of supply and a single unit of demand 
that are matched. This novel approach enabled them to 
substantially simplify some of the dynamic-programmingbased 
proofs on the structure of optimal policies as well 
as to prove several important new structural results. In 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

particular they have established the optimality of state-
dependent base-stock policies for the uncapacitated model 
with general Markov-modulated demand. Using this unit 
decomposition they have also suggested new methods to 
compute the optimal policies. However their computational 
methods are essentially dynamic programming approaches 
applied to the unit subproblems; hence they suffer from 
similar problems in the presence of correlated and non-
stationary demand. Although our approach is very different 
from theirs we use some of their ideas as technical 
tools in some of the proofs. Janakiraman and Muckstadt 
(2003) have extended this approach to capacitated models 
and established the optimality of state-dependent modified 
base-stock policies for models with Markov-modulated 
demand. 

Unfortunately the rather simple forms of these policies 
do not always lead to efficient algorithms for computing the 
optimal policies. Complex demand structures such as the 
one we consider in this work cause the state space of the 
corresponding dynamic programs to explode (see Iida and 
Zipkin 2006 Dong and Lee 2003 for relevant discussions 
on the MMFE model). There does not exist at present nor 
is there likely to be developed an efficient algorithm to 
solve these dynamic programs to optimality even for the 
uncapacitated model. The difficulty comes from the fact 
that we need to solve “too many” subproblems a phenomenon 
known as the curse of di ensionality. To date 
computational procedures have been made tractable only 
under assumptions of simple demand structures. If the 
demands in different periods are independent the corresponding 
dynamic programs are relatively straightforward 
to solve. Dynamic programming can still be tractable for 
uncapacitated models with Markov-modulated demand but 
under rather strong assumptions on the structure and the 
size of the state space of the underlying Markov process 
(see for example Song and Zipkin 1993 Chen and Song 
2001). Tayur (1992) uses the shortfall distribution and the 
theory of storage processes to derive an efficient computational 
method for computing the optimal policy in the stationary 
cost i.i.d. demand average cost case. Roundy and 
Muckstadt (2000) showed how to obtain approximate base-
stock levels also for the stationary cost and i.i.d. demand 
case by approximating the distribution of the shortfall process. 
Kapuscinski and Tayur (1998) proposed a simulation-
based technique using infinitesimal perturbation analysis 
to compute the optimal policy for capacitated problems 
with independent cyclical demands. Finally Özer and Wei 
(2004) developed an exact dynamic programming approach 
for the model with advance demand information when the 
forecast horizon exceeds the lead time by two periods and 
so the state space is only two dimensional. 

There have been heuristic approaches to compute order 
quantities for capacitated problems. However we are aware 
of very few attempts to analyze the worst-case performance 
of heuristics and most bounds derived are dependent 
on the particular input (see for example Lu et al. 

2006). To the best of our knowledge there are no other 
policies for stochastic inventory control models with constant 
worst-case performance guarantees. Metters (1997) 
found heuristics for capacitated lost-sales problems with 
independent cyclical demands. Chan (1999) have considered 
heuristics for uncapacitated and capacitated multi-item 
models. Instead of solving the one-period problem (as in 
the myopic policy) they have added a penalty function to 
the one-period problem which they called the Q-function. 
This function accounts for the holding cost incurred by the 
inventory left at the end of the period over the entire horizon. 
Their look-ahead approach with respect to the holding 
cost is somewhat related to our approach although significantly 
different. 

As we have already mentioned this paper builds on the 
work of Levi et al. (2007). They give the first algorithms 
with a constant performance guarantee for the uncapacitated 
stochastic inventory control model with correlated nonstationary 
demands; specifically their algorithms always find 
solutions of total expected cost no more than twice the 
optimal. Their algorithms are based on two main ideas. 
First they construct a look-ahead holding-cost accounting 
scheme called the arginal holding-cost accounting 
sche e to compute the additional holding costs incurred 
by units ordered in the current period throughout the entire 
horizon. Second they use cost-balancing techniques in that 
in each period they order exactly to balance the following 
two opposing costs: the conditional expected marginal 
holding cost against the conditional expected period backlogging 
cost a lead time ahead. Their approach relies heavily 
on the ability of the system to order in each period a 
“balancing quantity” that equalizes the expected marginal 
holding cost and the expected backlogging cost in the 
period. In capacitated systems the approach fails because 
this balancing quantity might not be attainable due to 
capacity constraints. Our forced marginal backlogging cost-
accounting scheme is designed to remedy this problem by 
reassigning backlogging costs more appropriately to the 
decisions that create them enabling us to find a “balancing 
order quantity” for capacitated systems. Suppose that 
in the current period the order placed was not up to capacity; 
we wish to account for the potential backlogging cost 
in future periods incurred directly by the decision not to 
use the full available capacity. Assume temporarily that 
we order up to capacity in each one of the periods. Suppose 
now that in the current period we do not order up 
to capacity. Then the expected marginal backlogging cost 
associated with the current period is the overall increase 
in the expected backlogging cost over the entire horizon 
resulting from this decision. In this way our balancing policy 
for a capacitated system is able to achieve the same 
worst-case performance guarantee of two with surprisingly 
little additional computational effort. When applied to 
uncapacitated models the policies described in this paper 
are identical to the dual-balancing policies described by 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

Levi et al. (2007). Thus they can be viewed as generalizations 
of the original dual-balancing policies to capacitated 
inventory models. 

We also use the marginal holding and forced marginal 
backlogging cost-accounting schemes to derive additional 
semimyopic policies called the lower- yopic and upper- 
yopic policies. The policies provide lower and upper 
bounds on the optimal base-stock levels respectively that 
can be used in conjunction with any policy to achieve lower 
expected cost. 

Furthermore in §4.2 we show how to use standard cost 
transformations to improve the performance of the algorithms 
in many important settings (see also Levi et al. 
2007). These transformations yield a modified instance of 
the problem that is equivalent to the original one from 
an optimization perspective but models only holding and 
backlogging costs. If the per-unit ordering cost is constant 
over time then applying our algorithms to the modified 
instance yields an approximation algorithm with a worst-
case guarantee of two with respect to the holding and 
backlogging costs and which has the same total per-unit 
ordering cost as the optimal policy. More generally when 
the ordering costs are large the worst-case performance 
guarantee of the modified cost dual-balancing policy will 
be much better than two. 

In §6 we test the typical empirical performance of the 
balancing algorithms in two settings. We consider an inventory 
system that has i.i.d. demand (no correlations) and a 
demand distribution with an exponential tail because the 
optimal policy can be computed analytically. (The motivation 
is to test the balancing policies at least in one environment 
in which the optimal policy and cost are known.) 
However balancing policies are most attractive in scenarios 
with complex demand structures whereas optimal policies 
cannot be computed and no provable good heuristics or 
reasonable lower bounds are known. Thus we also consider 
the same set of test scenarios tested in Hurley et al. 
(2006) in which the uncapacitated versions of these algorithms 
were evaluated computationally. In these scenarios 
the demands and forecasts evolve according to the multiplicative 
MMFE model. Optimal policies are not computable 
and strong lower bounds on the optimal cost do 
not exist so we used the myopic policy as a benchmark for 
evaluating performance. The performance of the balancing 
policies is very robust. It was within 11% of optimal on 
average in the first test (always within 25%) and consistently 
improved upon myopic by over 27% on average 
and by over 50% in many scenarios. 

This paper is organized as follows. In §2 we present 
the mathematical formulation of the periodic-review capacitated 
stochastic inventory control problem. In §3 we 
describe the forced marginal backlogging cost-accounting 
scheme for the capacitated model. In §4 we describe 
the balancing policy and its worst-case analysis. We also 
extend the approximation results to the case of discrete 
demand and stochastic lead times (see Appendix C in the 
online appendices which are available as part of the online 
version that can be found at http://or.pubs.informs.org/.). 
In §5 we develop lower and upper bounds on the optimal 
inventory levels and show how to use them to improve any 
policy. Section 6 contains our computational results. Online 
Appendix A contains a very simple illustrative example for 
the case of integer-valued demand. In Online Appendix B 
we present a detailed description of the scenarios tested in 
the computational results. 

2. Capacitated Peri dic-Review 
St chastic Invent ry C ntr l Pr blem 

In this section we provide the mathematical formulation of 
the capacitated periodic-review stochastic inventory problem 
and introduce some of the notation used throughout the 
paper. As a general convention we distinguish between a 
random variable and its realization using capital letters and 
lower-case letters respectively. Script font is used to denote 
sets. We consider a finite planning horizon of T periods 
numbered t = 1 T (note that t and T are both deterministic 
unlike the convention above). The demands over 
these periods are random variables denoted by D1 DT . 

As part of the model we assume that at the beginning 
of each period s we are given what we call an inforation 
set that is denoted by s . The information set s 
contains all of the information that is available at the 
beginning of time period s. More specifically the information 
set s consists of the realized demands (d1 ds−1) 
over the interval [1 s) and possibly some more (external) 
information denoted by (w1 ws ). The information 
set s in period s is one specific realization in the 
set of all possible realizations of the random vector Fs = 

D1 Ds−1 W1 Ws . This set is denoted by s . 
In addition we assume that in each period s there is a 
known conditional joint distribution of the future demands 
(D ) denoted by I = I which is determined 

sDT sss 

by (i.e. knowing we also know I ). For ease

ss ss 
of notation Dt will always denote the random demand in 
period t according to the conditional joint distribution Is for 
some s t where it will be clear from the context to which 
period s we refer. We will use t as the general index for 
time and s will always refer to the current period. The only 
assumption on the demands is that for each s = 1 T 
and each ∈ the conditional expectation E D · is

ss ts 

well defined and finite for each period t s. In particular 
we allow nonstationarity and correlation between the 
demands of different periods. 

In the periodic-review stochastic inventory control problem 
our goal is to supply each unit of demand while 
attempting to avoid ordering it either too early or too late. 
In period t (t = 1 T ) three types of costs are incurred 
a per-unit ordering cost ct for ordering up to ut units 
where ut 0 is the available order capacity in period t; 
a per-unit holding cost ht for holding excess inventory from 
period t to t + 1; and a per-unit backlogging penalty pt 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

that is incurred for each unsatisfied unit of demand at the 
end of period t. Unsatisfied units of demand are usually 
called backorders. Backorders accumulate fully over time 
until they are satisfied. That is each unit of unsatisfied 
demand will stay in the system and will incur a backlogging 
penalty in each period until it is satisfied. In addition 
there is a lead time of L periods between the time an order 
is placed and the time at which it actually arrives. We first 
assume that the lead time is a known integer L. In Online 
Appendix C we show that our policy can be modified to 
handle stochastic lead times under the assumption of no 
order crossing (i.e. any order arrives no later than those 
placed later in time). In §4.1 we show that extensions to 
the case of random capacities are straightforward. 

There is also a discount factor 1. The cost incurred 
in period t is discounted by a factor of t . Because the horizon 
is finite and the cost parameters are time-dependent we 
can assume without loss of generality that = 1. We also 
assume that there is no speculative motivation for holding 
inventory or having backorders in the system. To enforce 
this we assume that for each t = 2 T − L the inequalities 
ct ct−1 + ht+L−1 and c ct+1 + p are maintained

tt+L 
(where cT +1 = 0). (If there is a discount factor we require 
that c ct−1 + Lh and c ct+1 + Lp .) We 

tt+L−1 tt+L 

also assume that the parameters h p and c are all

tt t 
nonnegative. Note that the parameters hT and pT can be 
defined to take care of excess inventory and backorders at 
the end of the planning horizon. In particular pT can be 
set sufficiently high so as to ensure that there are very few 
backorders at the end of period T . 
The goal is to find a feasible ordering policy (i.e. one 
that respects the capacity constraints) that minimizes the 
overall expected discounted ordering cost holding cost and 
backlogging cost. We consider only policies that are nonanticipatory 
i.e. at time s the information that a feasible 
policy can use consists only of s and the current inventory 
level. 
Throughout the paper we will use Dst to denote the 
accumulated demand over the interval [st] i.e. Dst = 
t 

j=s Dj . We will also use superscripts P and OPT to refer 
to a given policy P and an optimal policy respectively. 

Given a feasible policy P we describe the dynamics of 
the system using the following terminology. We let NIt 
denote the net inventory at the end of period t which can be 
either positive (in the presence of physical on-hand inventory) 
or negative (in the presence of backorders). Because 
we consider a lead time of L periods we also consider the 
orders that are on the way. The sum of the units included in 
these orders added to the current net inventory is referred 
to as the inventory position of the system. We let Xt be the 
inventory position at the beginning of period t before the 

 t−1

order in period t is placed i.e. Xt = NIt−1 + j=t−L Qj 
(for t = 1 T ) where Qj denotes the number of units 

 


ordered in period j (we will sometimes denote tj −
= 
1 
t−L Qj 
by Qt−Lt−1 ). Similarly we let Yt be the inventory position 
after the order in period t is placed i.e. Yt = Xt + Qt . 

Note that once we know the policy P and the information 
set ∈ we can easily compute nis−1 x and y where

ss ss 
again these are the realizations of NIs−1 Xs and Ys respectively. 


3. Marginal C st Acc unting Scheme 

In this section we present a arginal cost accounting 
sche e for stochastic inventory control problems with 
capacity constraints on the size of the order in each 
period. This extends and generalizes the marginal cost 
accounting discussed by Levi et al. (2007). Because this 
cost-accounting approach is central for our approximation 
results we explain it in detail repeating some of the ideas 
of that paper. Our approach differs significantly from the 
traditional cost-accounting approaches which are based on 
standard dynamic programming. 

We start by reviewing their cost-accounting approach 
which is called arginal cost accounting. The main idea 
underlying this approach is to account for all the expected 
costs associated with the decision of how many units to 
order in period t when this decision is made. More specifically 
the decision in period t is associated with all the 
expected costs that after that decision is made become 
unaffected by any future decision and are only dependent 
on future demands. In Levi et al. (2007) it was shown that 
in uncapacitated models these costs are relatively easy to 
compute already in period t even though they may include 
costs that are going to be incurred only in future periods. 
Taking this approach Levi et al. have proposed a marginal 
holding-cost accounting scheme. Their approach is based 
on the convention that units in inventory are consumed on 
a first-ordered-first-consumed basis. This implies that the 
overall holding cost of the qs units ordered in period s 
(i.e. the holding cost they incur over the entire horizon 

sT ) is a function only of future demands and is independent 
of any future decision. Based on the assumption that 
inventory is consumed on a first-ordered-first-consumed 
basis the qs units on order will be used to satisfy demand 
only when the xs units presently in the system have been 
completely consumed. Among these qs units the number 
of those still remaining in inventory at the end of period j 
(where j s + L) is precisely qs − Ds j − xs ++ . 
Each of these units incurs a cost of hj . More specifically 
conditioning on an information set s ∈ s the 
marginal holding cost is defined to be (assuming again 

 T

that = 1) hj q − D − x ++ . Observe again

j=s+L s sjs 

that for each nonanticipatory policy P if conditioned on 
some t ∈ t the inventory position at the beginning of 
period t denoted by xtP is known deterministically. In addition 
once the order in period s is determined the backlogging 
cost a lead time ahead in period s + L i.e. 
pD − x + q + is also dependent only on the

s+L ss+L ss 

future demands. This leads to a marginal cost accounting. 
For each feasible policy P let Ht 
P be the ordering and 
holding cost incurred over the interval tT by the Qt 
P 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

units ordered in period t (for t = 1 T ) andlet P 
t 
be the backlogging cost incurred a lead time ahead in 
period t + L (t = 1 − LT − L). That is Ht 
P = ctQt 
P + 

 T 

+ P 

j=t+L hj Qt 
P − Dt j − Xt 
P + and t = pt+LDt t+L − 
XP + QP + 

(where Dj = dj with probability one and

t+Lt 

QP is given as an input for each j 0). Let P be

j = qj 

the cost of the policy P. Clearly 

0 T −L 

 

P HPP

 P =+ H −+ + (1)

tT tt 
t=1−Lt=1 

where H − T denotes the total expected holding cost 
incurred over the interval 1 T by units ordered before 

 0

period 1. We note that the first two expressions P 

t=1−Lt 
and H − T are not affected by our decisions (i.e. they 
are the same for any feasible policy and each realization 
of the demands). Note that without loss of generality we 
can assume that Qt 
P = Ht 
P = 0 for any policy P and each 
period t = T −L+1 T because nothing that is ordered 
in these periods can be used within the given planning 
horizon. 
In models with no capacity constraints there is a fundamental 
difference between holding cost and backlogging 
cost. In particular any mistake of ordering “too little” can 
be fixed in the next period to avoid further backlogging 
cost. In particular the decision of how many units to order 
affects the backlogging cost in a single period. However 
the effect of this decision if we have ordered “too much ” 
may last for a number of periods depending on the realized 
future demands. That is no future decision can fix this 
mistake because we cannot order a negative quantity. Consequently 
P 
t only accounts for costs incurred in a single 
period namely backlogging cost in period t + L and Ht 
P 
accounts for holding costs incurred over multiple periods. 
By way of contrast in models with capacity constraints on 
the size of the order in each period the above observation 
is no longer valid. More specifically because of the 
capacity constraints it is no longer true that a mistake of 
ordering “too little” in the current period can always be 
fixed by decisions made in future periods. 

3.1. Marginal Backl gging C st Acc unting 

We now present a new backlogging cost accounting that 
associates with the decision of how many units to order in 
period s what we shall call forced backlogging cost resulting 
from this decision in future periods. 

Consider some period s. Suppose that xs is the inventory 
position at the beginning of period s and that the number of 
units ordered in the period is qs <us. Let q s be the resulting 
unused slack capacity in period s i.e. q = u − q> 0.

s ss 

Focus now on some future period t s + L when this order 
arrives and becomes available. Suppose that for some realization 
of the demands we have that d − x + q +

st ss 

 


j∈ st−Luj > 0. This implies that there exists a shortage 
in period t and moreover even if in every period after 
period s and until period t − L the orders had been up to 
the maximum available capacity this part of the shortage 
in period t would still exist and incur the corresponding 
backlogging cost. The actual shortage may be even big

 


ger and equal to ds t − xs + qs + j∈ st−Lqj > 0 (recall 
that qj uj for each period j). In other words given our 
decision in period s this part of the shortage could not be 
avoided by any decision made over the interval st − L 
(clearly any order placed after period t − L will not be 
available by time t). We conclude that if more units had 
been ordered in period s then at least some of the shortage 
in period t could have been avoided. More precisely 
the maximum number of units of shortage that could have 
been avoided by ordering more units in period s is equal 

 


to min q s dst − xs + qs + j∈ st−L uj + . The intuition 
is that by ordering more units in period s we could 
have averted part of the shortage in period t but clearly not 
more than the unused slack capacity q s because we could 
not have ordered in period s more than additional q s units. 
In this case we would say that this part of the backlogging 
cost in period t was forced by the decision in period s and 
hence period s is associated with a backlogging penalty 

 
+

of pt min q s dst − xs + qs + j∈ st−L uj . This is 
significantly different from the traditional backlogging cost 
accounting in which this cost would be associated with 
period t − L. 

We let Wst be the shortage in period t that is forced by 
the decision in period s (where again s t − L) i.e. 

 
+ 


 


Wst = min Qs Dst − Xs + Qs + uj 
j∈ st−L 

An alternative way to express Wst using min ab += 
b +− +

b − a for a ∈ R+ and b ∈ R is 

 
+ 

 


W = D − X + Q + uj

st stss 
j∈ st−L 
+ 

 


− Ds t − Xs + uj (2) 
j∈ st−L 

 


Now using the equalities NIt = Xs +Qs + j∈ st−L Qj − 
Ds t (for each s t − L) and uj = Qj + Qj (for each j = 
st − L) we conclude that Equation (2) can be written 
as 

 + + 

 


Dt −NIt − Qj − Dt −NIt − Qj (3) 
j∈ st−Lj∈ st−L 

To see why (2) (and hence (3)) holds observe that 

 
+

Ds t − Xs + Qs + j∈ st−L uj >Qs if and only if 

 


D − X ++ > 0. Next we describe sev


st sj∈ st−L uj 
eral properties of the parameters Wst. Clearly if Qs = 0 
(i.e. Q = u ) then W = 0 for each t s + L. It is also

ss st 

readily verified from (3) that if Wst > 0 for some s t − L 
then we have Wjt = Qj for each j ∈ st − L . 


For each s = 1 − LT − L let P be the overall Figure 2. Forced marginal backorder cost accounting.

s 

forced backlogging cost in periods s + LT associated 

P TWP 100 

with period s i.e. = (we again assume 

st=s+L pt st 

90 

Capacity 
qs 
Forced marginal backlogging cost at s 
Cumulative demand 
Cumulative supply 

that Dj = dj with probability one for each j 0). Let 

80 

u−L = q−L = 0 and q −L = and also define for each 

t = 1 T 

 
+ 


Demand/supply 

70 

60 
50 
40 

30 
20 

W−Lt = D 1−Lt − x1−L + 

INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

uj 

j∈ 1−Lt−L 

 
+ 


= Dt − NIt − Qj 
j∈ 1−Lt−L 

 T

and P 
−L =−L = t=1 ptW−Lt. The last definition of 

−L is meant to account for the forced backlogging cost 
which is independent of any decision and is forced by 
the demands on any feasible policy. It is now readily 
verified that for each t = 1 T and for each pol

 


P + t−L

icyP wehave t−L = pt Dt − NIP 
t = pt j=−L Wjt 
P (the 
sum t−L is telescopic). This implies the following 

j=−L Wjt 

theorem. 

 heorem 1. Let P be a n nanticipat ry p licy. Then, the 

 0

c st f p licy P can be expressed as P = P +

t=−Lt 

 T −L

+ HP + P .

H − Tt=1 tt 

Note that the first two terms of P in Theorem 1 

 0 

P and H − T are independent of any decision we

t=−Lt 

make and are common to all feasible policies. Recall that 

 0 

P represents the forced backlogging penalty that

t=−Lt 

is forced on any feasible policy. Because these two terms 
are also nonnegative we omit them from the analysis. This 
does not impact our approximation results. From now on 
we will write the cost of a feasible policy P as P = 

 T −LHPP

+ . In Online Appendix A we provide an

t=1 tt 

illustrative example of our new cost-accounting approach. 

The intuition is that once a shortage is incurred in 
period t it is allocated to past periods s t − L in which 
the orders were below the available capacity. More specifically 
the shortage and the resulting backlogging cost in 
period t are charged to periods s t − L with positive 
unused slack capacity going backward in time from 

Figure 1. Period-by-period backorder cost accounting. 

100 
90 

10 
0 

… 
Time 

1 s–3 s–2 s–1 s 

period t − L. Each period s t − L can be charged with a 
part of the backlogging cost in period t for up to q s units 
the unused slack capacity in period s. 

Figures 1 2 and 3 graphically illustrate the difference 
between classical period-by-period accounting and forced 
marginal accounting for backlogging costs. All three figures 
reflect a single sample path of demands and orders. The 
total backlogging cost over the horizon is the area above the 
cumulative supply curve (thick line) and below the cumulative 
demand curve (thin line). Classical period-by-period 
accounting assigns to period s the difference between the 
curves at s (see Figure 1). Forced marginal accounting of 
backlogging costs assigns to period s all of the backlogging 
costs that were “forced ” or made inevitable because 
we did not order to capacity in period s. This corresponds 
to the area inside of the trapezoid shown in Figure 2. This 
trapezoid is created by extending the cumulative supply 
curve starting at s− 1 and at s to the right at a slope equal 
to the capacity of the system. These lines represent what 
the supply curves would look like if our policy consistently 
ordered at full capacity from s − 1 and s onwards respectively. 
In fact consider the thick short bars in the trapezoid 
in Figure 2. The first and second terms of (2) are the 
vertical coordinates of the end points of these bars. Conse-

Figure 3. Allocation of a period backorder to ordering 
decisions. 

Forced marginal backlogging cost at sattributed to s–2, s–1, s 

100 
90 

1 s–1 
Ws–2 s 
Ws–1 s 
WssCapacity 
Cumulative demand 
qs 
Cumulative supply 
Capacity 
Cumulative demand 
Cumulative supply qs 
Classical backlogging cost at s 
12 3… s–1 s 


80

80 

Demand/supply 

Demand/supply 

70 
60 
50 

40 

70 

60 

50 

40 

30 

30 

20 

20 

10 

10 

0 

0 

… s–3 s–2 s 
Time Time 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

quently each Wst for t>s is the length of one of these bars. 
Figure 3 takes a different point of view. It considers the 
backlogging costs incurred in period s and illustrates how 
those costs are allocated to periods ss − 11 −L. 

In Levi et al. (2007) it is shown that the marginal holding 
cost consists of a sum of partial expectations. Once 
xs is known at time s the summands are expectations 
of simple piecewise-linear functions. If the accumulated 
demand Ds j (for each j s) has any of the distributions 
that are commonly used in inventory theory (e.g. Normal 
Gamma Lognormal Laplace etc.) (Zipkin 2000) then it 
is extremely easy to evaluate these terms. If the distribution 
of Ds j is discrete these functions can be computed recursively 
in efficient ways using the c.d.f.s. More generally 
the complexity of evaluating the marginal holding cost can 
vary depending on the level of information we assume on 
the demand distributions and their characteristics. In all of 
the common scenarios there exist straightforward methods 
to solve this problem efficiently (see also Hurley et al. 
2006 for more details). Because in the presence of positive 
lead times even computing a simple myopic policy requires 
the same knowledge on the distribution of the accumulated 
demand over the lead time the computational effort 
involved with computing the marginal holding cost is of 
the same order of magnitude as for the myopic policy. 
Evaluating the marginal backlogging costs based on the 
scheme developed in this paper is analogous to the marginal 
holding cost. It is a sum of partial expectations of simple 
piecewise-linear functions and therefore is no more difficult 
to compute. 

Finally observe that for uncapacitated models with 
us = for each s (and hence q s = ) our backlogging 
cost accounting is in fact identical to the traditional backlogging 
accounting discussed above. This implies that the 
cost-accounting scheme proposed in this paper is a generalization 
of the one introduced in Levi et al. (2007). Therefore 
the preceding discussion is also a generalization of the 
corresponding algorithm and analysis in Levi et al. (2007). 

4. Dual-Balancing P licy 

In this section we describe a new policy for the capacitated 
periodic-review stochastic inventory control problem. 
As in Levi et al. (2007) we call it a dual-balancing policy. 
We shall show that this policy has a worst-case performance 
guarantee of two i.e. for each instance of the 
problem the expected cost of the policy is at most twice the 
expected cost of an optimal policy. Recall the assumption 
discussed in §2 that the cost parameters imply no motivation 
for holding inventory or backorders. This implies that 
without loss of generality for each t = 1 T ct = 0 
and ht pt 0. Moreover we first describe the algorithm 
its analysis and several extensions under the latter assumption. 
Then in §4.2 we discuss in detail the generality of 
this assumption. 

The dual-balancing policy presented in this paper is 
based on a balancing idea similar to the one used in 
Levi et al. (2007) for the uncapacitated model. That dual-
balancing policy balances in each period s and conditioned 
on the observed information set s the expected marginal 
holding cost of the units ordered in the period against the 
expected (traditional) backlogging cost in period s + L 
a lead time ahead of s. However it is readily seen that this 
approach does not work in the case where there is a capacity 
constraint on the size of the order in period s. For one 
the order size qs that balances these two costs might not be 
reachable when qs >us. 

In turn we consider the forced marginal backlogging 
cost accounting and the corresponding cost it associates 
with period s as described in §3 above. Conditioned on the 
observed information set s we now balance the expected 
marginal holding cost of the units ordered in period s 
against the expected arginal backlogging costs associated 
with period s. We will use the superscript B to refer to the 
dual-balancing policy. For each period s = 1 T − L 
conditioning on the observed information set s let ls 
B qs 
B 
be the expected holding cost incurred over sT by the 
units ordered by the dual-balancing policy in period s. 
That is lB qB = E HB qB · . In §3 we have defined 

ss sss 

 T

HB QB ++

=− D − XB (recall that we 

sj=s+Lhjt sj s 

B BB

assume c = 0). In addition let q · be

ss = E ss s 
the expected backlogging cost associated with period s by 
the forced marginal backlogging cost-accounting scheme 
described above again conditioned on the observed information 
set s. Recall that in §3 we have defined s 
B = 

 T 

WB where

t=s+L pt st 

 
+ 


 


WB = min QB D − XB + QB +

st sst ssuj 
j∈ st 
+ 

 


= D − XB + QB + uj

st ss 
j∈ st−L 
+ 

 


− D − XB +

st suj 
j∈ st−L 

If we condition on s the inventory position at the beginning 
of period s xs 
B is known deterministically. That is 

B BB

it is clear that lB q and q are both indeed functions 

ss ss 

of qs 
B the number of units ordered in period s. 

We first discuss the case where the orders are 
allowed to be fractional. This implies that the functions 

lBB BB

q and q are continuous. In each period

ss ss 

s = 1 T − L given the observed information set s 
the dual-balancing policy will order qB = q u units 

s ss 

such that the expected marginal ordering and holding cost 
incurred by these units over sT is equal to the expected 
forced marginal backlogging cost associated with period s. 
In other words we order q units such that lB q = 

s ss 

E HB q · =	 B q = EB q · . Next we show 

sss ss sss 

that this policy is well defined. It is readily verified that 

lBB B

q is a convex increasing function of q that is equal to 

ss s 
zero for qs 
B = 0 and goes to as qs 
B goes to . Similarly 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

one can verify that s 
B qs 
B is a decreasing convex function 
of qs 
B that has a nonnegative value at qs 
B = 0 and that is 
equal to zero for qs 
B = us (in this case there is no unused 
slack capacity at s and q s 
B = 0). Our assumption that these 
functions are continuous implies that qs as defined above 
always exists. 

Computationally qs is the minimizer of the function 

B BBB

gs q = max &lB qq which is a convex func


s ss ss 
tion of qs 
B because it is the maximum of two convex functions. 
Hence in each period s we need to solve a convex 
minimization problem of a single variable. In particular if 
for each j s Ds j is distributed according to any of those 
distributions that are commonly used in inventory theory 
then it is extremely easy to evaluate the functions ls 
B qs 
B 
and s 
B qs 
B . More generally the complexity of the algorithm 
is of order T (i.e. number of time periods) times the 
complexity of solving the single variable convex minimization 
defined above. The complexity of this minimization 
problem can vary depending on the level of information 
we assume on the demand distributions and their characteristics. 
In all of the common scenarios there exist 
straightforward methods to solve this problem efficiently. 
In particular qs is determined by the intersection of two 
monotone convex functions which suggests that bisection 
methods can be effective in computing qs . We note that the 
dual-balancing policy is not a state-dependent base-stock 
policy. However it can be computed in an online anner 
i.e. computing the policy action in period s does not 
require any knowledge of the future decisions to be made 
in the next periods. Moreover unlike the myopic policy the 
dual-balancing policy does use available information about 
long-term future demands. 

4.1. Analysis 

Next we show that for each instance of the problem the 
expected cost of the dual-balancing policy described above 
is at most twice the expected cost of an optimal policy. 
We will use the marginal cost-accounting scheme described 
in §3 and amortize the period cost of the dual-balancing 
policy with the cost of the optimal policy. 

Using the marginal cost-accounting scheme discussed 
in §3 the expected cost of the dual-balancing policy can 

 T −L

be expressed as E B = E HB + B . For each

t=1 tt 

t = 1 T − L let Zt be the rando balanced cost by 
the dual-balancing policy in period t i.e. Z = E HB · .

t tt 
Note that Zt is a function of the observed information set 
in period t. In the next lemma we obtain an expression 
for the expected cost of the dual-balancing policy using 
the Zt variables. The proof is identical to the proof of 
Lemma 4.1 in Levi et al. (2007). 

Lemma 1. The expected c st f the dual-balancing p licy 
is equal t twice the expected sum f the Zt variables, i.e., 

 T −L

= 2E Z .

E Bt=1 t 

In the next two lemmas we show that the cost of OPT 
can be amortized against some of the cost of the dual-
balancing policy. In particular they imply that the expected 

 T −L

cost of OPT is at least E Z . For each realization

t=1 t 
of the demands D1 DT let H be the set of periods 
t = 1 T − L in which the optimal policy had inventory 
position higher than that of the dual-balancing policy i.e. 

<Y OPT

the set of periods 1 t T − L such that Yt 
B 
t . 
Let be the set of periods in which the dual-balancing 
had inventory position at least as high as that of OPT i.e. 

 Y OPT

the set of periods t = 1 T − L such that Yt 
B 
t . 
(We consider only the periods t = 1 T − L because 
the effective ordering decisions are made in these periods. 
Specifically each order placed after period T −L will arrive 
after period T .) Observe that H and are random sets 
that induce a random partition of the horizon. 

The next lemma shows that with probability one the 
marginal holding cost incurred by the dual-balancing policy 
in periods t ∈ H is at most the total holding cost 

 


 HOPT

incurred by OPT denoted by HOPT i.e. HB 

t∈ Ht 

with probability one. The proof is identical to the proof of 
Lemma 4.2 in Levi et al. (2007). 

Lemma 2. F r each realizati n T ∈ T , the t tal marginal 
h lding c st incurred by the dual-balancing p licy f r all 
f the peri ds t ∈ H is at m st the t tal h lding c st 

 


HB HOPT

incurred by OPT, den ted by HOPT , i.e., 

t∈ Ht 

with pr bability ne. 

The next lemma shows that with probability one the 
marginal backlogging cost of the dual-balancing policy 
associated with periods t ∈ is at most the total backlogging 
penalty incurred by OPT denoted by OPT. 

Lemma 3. F r each realizati n T ∈ T , the t tal marginal 
backl gging c st f the dual-balancing p licy ass ciated 
with all f the peri ds t ∈ is at m st the t tal backl 
gging penalty incurred by OPT, den ted by OPT , i.e., 

 


B OPT with pr bability ne. 

t∈ t 

The forced marginal backlogging cost associated with 
the periods in is equal to 

 


pWB = pWB 

tstt st 
s∈ tt s+L ts∈ s t−L 

Therefore it is sufficient to show that for each t = L + 
1 T the traditional backlogging cost incurred by OPT 
in that period is at least as much as the forced backlogging 
costs incurred by the dual-balancing policy in period t as 
a result of decisions made in periods &s ∈ s t − L' . 
In other words it is sufficient to show that for each t = 
L + 1 T wehave 

 


− NIOPT + WB

D 

tt st 
s∈ s t−L 

with probability one. (Recall that the backlogging costs 
over the periods 1 L are the same for all policies.) 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

Consider now a specific realization T ∈ T and 
some period t =1 T . If there is no period in 

&s ∈ s t −L' with wst 
B > 0 then there is nothing to 
prove. Assume that such a period s exists and let sl and se 
be the latest and the earliest periods in the set &s ∈ s 
t −L wst 
B > 0 respectively (it is possible that sl =se). We 
note again that here we abuse our notation and consider the 
set as the realized set of periods according to the specific 
realization T . In particular se and sl are the respective 
realizations of random variables Se and Sl. We have already 
seen (in the discussion in §3) that for each s ∈ se sl we 

 

BBB B

have wst = qs and wse t ds t −xse +qse + j∈se t−L uj . 
Indeed 
e 

 


 


−niOPT OPT OPT

dt t =dt − y + qj −dsl t

sl 
j∈sl t−L 

 


 


 dsl t − ys 
B 
l + uj 

j∈sl t−L 

 


 


=d − y B + q B −d + uj

ss 
j∈ssl j∈sl t−L 

slt e jesl 

e 

 


 

BB B

=d − x +q + uj + q 

setss j

ee 

st−Lj∈s sl

ee 

 
B 
B

 w w

st st 
j∈ssl j∈s sl ∩ 

ee 

The first equality is again based on the fact that for each 
feasible policy and for each s t wehaveNIt 
=Ys + 

 


j∈st−L Qj −Ds t applied to OPT and periods sl t −L. 
The first inequality follows from the assumption that 

OPT B

sl ∈ and so y y and from the capacity constraints 

sl sl 

that imply qOPT uj . The second equality follows from

j 
the fact that (for each s s ) Ys =Ys + j∈ss Qj −Ds s 
applied to the dual-balancing policy and periods se sl. 
The last equality is achieved by adding and subtracting 

 
B 

j∈sq and from the fact that uj =Qj +Qj . The proof 

esl j 

then follows. 

As a corollary of Lemmas 1 2 and 3 we get the following 
theorem. 

 heorem 2. The dual-balancing p licy has a w rst-case 
perf rmance guarantee f tw , i.e., f r each instance f 
the capacitated peri dic-review st chastic invent ry c ntr l 
pr blem, the expected c st f the dual-balancing p licy is 
at m st twice the expected c st f an ptimal s luti n, i.e., 

E B 2E OPT . 

From Lemma 1 we know that the expected cost of the 
dual-balancing policy is equal to twice the expected cost of 

 T −L

the sum of the Z variables i.e. E B = E Z .

tt=1 t 

From Lemmas 2 and 3 we know that with probability 
one the cost of OPT is at least as much as the holding 
cost incurred by units ordered by the dual-balancing 
policy in periods t ∈ H plus the forced marginal backlogging 
cost of the dual-balancing policy that is associated 
with periods t ∈ . In other words with probability one 

 


HOPT + OPT B 

t∈ H Ht 
B + t∈ t . Again using conditional 
expectations and the definition of Zt this implies 
that indeed 

E OPT 

 


 
HB 
B

 E t + t 
t∈ Ht∈ 

 
B

= E Ht 
B · t ∈ H + t · t ∈ 
t 

 


= EE HB · t ∈ H + B · t ∈ · 

t tt 
t 

 


= E t ∈ H + t ∈ Zt = E Zt 
tt 

We note that if the optimal policy is deterministic (i.e. 
it makes deterministic decisions in each period t given the 
observed information set t) then if we condition on t 

B OPT

then yt and yt are known deterministically and so are the 
indicators t ∈ H and t ∈ . If the optimal policy is 
random then the same arguments above still work. We now 
need to condition not only on t but also on the decisions 
made by the policies. Because the inventory control policy 
does not have any effect on the evolution of the demand the 
arguments above are still valid. This concludes the proof 
of the theorem. 

We note that the examples discussed in Levi et al. 
(2007) show that the above analysis is tight. However the 
analysis hints that in a typical scenario the performance 
would be significantly better. Hurley et al. (2006) present a 
thorough empirical analysis of the typical performance of 
dual-balancing policies in uncapacitated models. In §6 we 
present empirical results that confirm that this phenomenon 
extends to the capacitated case. 

Finally we note that the dual-balancing policies and the 
worst-case analysis can be extended to models where the 
capacities in each period are generated by some exogenous 
random process and the exact capacity available in period t 
is observed only at the beginning of the period. Thus 
the dual-balancing policies provide a worst-case guarantee 
of two for this important extension as well. In this 
case the expectations of the marginal backlogging costs 
are taken with respect to both the random future demands 
and random future capacities. In Online Appendix C we 
consider two extensions of the dual-balancing policy and 
the worst-case analysis. Specifically we discuss the extensions 
to models where orders must be integral and the 
demands are integer-valued random variables and to models 
with stochastic lead times under the no-order-crossing 
assumption. 

4.2. C st Transf rmati n 

In this section we discuss in detail the cost transformation 
that enables us to assume without loss of generality that 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

for each period t= 1 T wehave ct = 0 and ht pt 0. 
Consider any instance of the problem with cost parameters 
that imply no speculative motivation for holding inventory 
or backorders (as discussed in §2). Following Levi et al. 
(2007) we use a simple transformation of the cost parameters 
to construct an equivalent instance with the property 
that for each period t= 1 T wehave ct = 0 and ht 
pt 0. More specifically the modified instance has the 
same set of optimal policies. Applying the dual-balancing 
policy to that instance we obtain a policy that is different 
from the original dual-balancing policy and which also 
has a performance guarantee of at most two with respect 
to the original problem. We shall show that this cost transformation 
can improve the performance guarantee of the 
dual-balancing policy in cases where the ordering cost is 
the dominant part of the overall cost. In practice this is 
often the case. 

We now describe the transformation for the case with 
no lead time (L= 0) and = 1; the extension to the case 
of arbitrary lead time is straightforward. Recall that any 
feasible policy P satisfies for each t= 1 T Qt = NIt− 
NIt−1 +Dt (for ease of notation we omit the superscript P). 
Using these equations we can express the ordering cost in 
each period tas c NI −NIt−1 +D . Now replace NI with

ttt t 
NI+ 
t − NI− 
t its respective positive and negative parts. 
This leads to the following transformation of cost param


ˆ

eters. We let cˆ= 0 h = h + c − cc = 0 

t t ttt+1 T+1 

and pˆ= p − c + ct+1. Note that the assumptions on the

t tt 
cost parameters ct ht and pt discussed in §2 and in particular 
the assumption that there is no speculative motivation 
to hold inventory or backorders imply that hˆ 
t and bˆ 
t above 
are nonnegative (t= 1 T). Observe that the parameters 
hˆ 
t and bˆ 
t will still be nonnegative even if the parameters 
ct ht and pt are negative and as long as the above 
assumption holds. Moreover this enables us to incorporate 
into the model a negative salvage cost at the end of 
the planning horizon (after the cost transformation we will 
have nonnegative cost parameters). It is readily verified that 
the induced problem is equivalent to the original one. More 
specifically for each realization of the demands the cost 
of each feasible policy P in the modified input decreases 

 T

by exactly d (compared to its cost in the original

t=1 ct t 

input). Therefore any optimal policy for the modified input 
is also optimal for the original input. 

Now apply the dual-balancing policy to the modified 
problem. We have seen that the assumptions on ct ht 
and p ensure that hˆ and pˆ are nonnegative and hence the 

t tt 

analysis presented above is valid. Let opt and opt be the 
optimal expected cost of the original and modified inputs 

 T

respectively. Clearly opt = opt + E Dt . Now the

t=1 ct 

expected cost of the dual-balancing policy in the modified 
input is at most 2opt. Its cost in the original input is then 

 T T

at most 2opt + E D = 2opt − E D . This

t=1 ctt t=1 ct t 

 T

implies that if E t=1 ctDt is a large fraction of opt then 
the performance guarantee of the expected cost of the dual-
balancing policy might be significantly better than two. 

 T

For example if E t=1 ctDt 0 5 opt then we can conclude 
that the expected cost of the dual-balancing policy 
is at most 1 5 opt. It is indeed the case in many real-life 
problems that a major fraction of the total cost is due to the 
ordering cost. The intuition of the above transformation is 

 T

that D is a cost that any feasible policy must pay. 

t=1 ct t 

As a result we treat it as an invariant in the cost of any 
policy and apply the approximation algorithm to the rest of 
the cost. 

In the case where we have a lead time L we use the 
equations Qt = NIt+L − NIt+L−1 + Dt+L for each t = 
1 T − L to get the same cost transformation. The 
transformation for >1 is also straightforward. Also it is 
not hard to see that the cost transformation can be modified 
to remove say 1% of the per-unit ordering costs where 
0 <1<100. This leads to a continuum of dual-balancing 
policies all of which are two-approximations. 

5. Impr ved P licy and B unds n 
the Optimal Invent ry Levels 

In this section we consider two semimyopic (modified) 
base-stock policies that are easy to compute in an online 
manner and provide respectively lower bounds and upper 
bounds on the inventory levels of an optimal policy yt 
OPT in 
each period t= 1 T. We believe that these bounds can 
be used effectively to improve existing algorithms for computing 
inventory control policies for the capacitated model 
discussed in this paper and other capacitated stochastic 
inventory models. Moreover as in Hurley et al. (2006) we 
shall show that these policies provide bounds that are strong 
in the following sense: each policy that for some period t 
and some state t has inventory level outside the range 
defined by the respective lower and upper bounds can be 
improved. In particular there is another (modified) policy 
that in period tand state t admits an inventory level within 
the specified range with expected cost no greater than the 
expected cost of the original policy. In other words any 
policy that violates these respective bounds is dominated 
by another policy. We then follow Hurley et al. (2006) 
and construct an i proved dual-balancing policy that incorporates 
these bounds. This policy also has a performance 
guarantee of two and as the computational study for the 
uncapacitated model in Hurley et al. (2006) suggests we 
expect that it will have a better typical performance. 

The policies we consider are called lower- yopic 
(denoted by LM) and upper- yopic (denoted by UM) 
respectively. In the lower-myopic policy in each period s 
conditioning on the observed information set s we minimize 
the su of the expected marginal holding cost of the 
units ordered in that period and the traditional expected 
backlogging costs a lead time ahead. That is in each 
period s we minimize 

LM = lLM + 

gsqs sqs + E ps+LDs s+L − xs + qs · s 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

under the constraint qs us. This is a convex function of qs. 
This policy was first proposed for the uncapacitated model 
by Levi et al. (2007) who called it the ini izing policy. 
They have shown that this is a base-stock policy that provides 
lower bounds on the optimal base-stock levels. However 
in the capacitated model it is possible that the actual 
minimizer will not be attainable. In this case we order up 
to capacity and this provides a modified base-stock policy. 
In this paper we extend and generalize their proof for 
the capacitated model. In the upper-myopic policy in each 
period s again conditioning on s we minimize the sum of 
the expected period holding cost and the expected forced 
marginal backlogging. Thus we minimize 

UM UM + 

gs qs =	 s qs + E hs+L xs + qs − Ds s+L · s 

subject to 0 qs us which is also convex in qs. We shall 
show that this policy provides upper bounds on the inventory 
levels of an optimal policy. By arguments similar to 
the ones used by Levi et al. (2007) it can be shown that 
this gives rise to yet another modified base-stock policy. 

UM1 UM2 1

(In particular gs q − gs q depends only on y = 
xs + q1 and y2 = xs + q2.) To the best of our knowledge 
this is a new way for deriving upper bounds on the inventory 
levels of an optimal policy in the capacitated model. 
We note that it is not clear whether the classical myopic 
policy where we minimize the expected period cost provides 
any bounds for capacitated models. Another similar 
open question is how the policy that in each period minimizes 
the sum of the expected marginal holding cost and 
expected forced marginal backlogging cost is related to an 
optimal policy. 

Let Yt 
LM and Yt 
UM be the respective inventory position 
(after orders are placed) of the lower-myopic and the upper-
myopic policies in period t = 1 T. Specifically we 
assume that Yt 
LM is the smallest minimizer of the corresponding 
period problem being solved (see above) and that 

YUM 

t is the largest minimizer of the corresponding period 
problem. Note that the inventory position levels depend on 
the specific state ( t xt) but for ease of notation we omit 
the indication of the state. The two semimyopic policies 
described above can be implemented in an online manner 
i.e. regardless of the action control in future periods. We 
shall show that for each evolution T these two policies 
provide lower and upper bounds on the inventory levels of 

 YOPT YUM

any optimal policy i.e. Yt 
LM 
tt with probability 
one for each t= 1 T. Moreover we shall show that 

 YUM

each nondominated policy P must have YLM YP 

t tt 

for each t= 1 T. 

The next two lemmas show that each policy P that has 
for some period s and state s inventory position ys 
P 

LM UM

ys ys can be strictly improved by a modified pol-

P LM UM

icy P with y ∈ yy and expected cost at most the 

s ss 

expected cost of P. For the sake of simplicity we consider 
a model with no lead time (the extensions to the case with 
L>0 are straightforward). 

Lemma 4. C nsider a feasible p licy P, and supp se that 
f r s me peri d sand inf rmati n set s, we have ys 
P <ys 
LM . 
Further assume that s is the earliest such peri d. Then, 
the p licy P that f ll ws P until peri d s− 1, then rders 
up t ys 
LM in peri d s and again imitates P ver the interval 
sT , has expected c st n larger than the expected 
c st f P. 

Because P follows P over 1 s we conclude that they 
incur exactly the same cost over that interval and that they 
have the same inventory position x yP <yLM. Because

ss s 

s is the first such period we conclude that P can indeed 
order up to ys 
LM. Now over sT P imitates P; that is it 
orders nothing if XP YP and orders up to YP otherwise

jj j 

(for each j ∈ sT ). Moreover the policy P has ordered 
qs 
P units in period s. Consider the overall expected marginal 
holding cost of these units and the expected (traditional) 
backlogging cost incurred by P in period s. By the definition 
of qs 
P it is clear that this is no greater than the 
expected marginal holding cost and expected (traditional) 
backlogging cost incurred by the policy P in period s.For 
each period j∈ sT we know that with probability one 
YP YP and that QP Qj 
P. This implies that the backlog


jj j 

ging incurred by policy P over that interval is no greater 
than the backlogging cost incurred by policy P and similarly 
the marginal holding-cost policy that P incurs over 
that interval is no greater than the respective marginal holding 
cost of policy P. The lemma then follows. 

Lemma 5. C nsider a feasible p licy P, and supp se 
that f r s me peri d s and inf rmati n set s, we have 
ys 
P >ys 
UM . Further assume that s is the earliest such peri d. 
Then, the p licy P that f ll ws P until peri d s− 1, then 
rders up t ys 
UM in peri d s and again imitates P ver 
the interval sT , has expected c st n larger than the 
expected c st f P. 

By arguments identical to the ones in Lemma 4 we conclude 
that P and P incur the same cost over 1 s and that 
they have the same inventory position x yUM <yP. The

ss s 

first inequality follows from the fact that s is the first period 
in which P has more inventory than the upper-myopic policy. 
Thus P can order up to Ys 
UM and assume that it 
orders qsp . Consider the overall expected forced marginal 
backlogging cost and expected period holding cost incurred 
in period s by policy P . By the definition of qs 
P we conclude 
that this expected cost is smaller than the respective 
expected cost incurred by policy P in period s. Now over 
sT P again tries to imitate P i.e. for each j∈ sT it 
will order up to Yj 
P or up to the capacity uj. Now let S be 
the earliest (random) period after period s in which P has 
reached YS 
P. Clearly over ST the policies P and P are 
again identical and hence incur the same cost. Observe 
that for each j∈ sS wehave YP YP and QP Qj 
P 

jj j 

with probability one. This implies that the expected holding 
cost and the expected forced marginal backlogging penalty 
incurred by policy P over that interval are each no greater 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

than the respective expected cost incurred by policy P. The 
lemma then follows. 

Lemmas 4 and 5 imply the following corollary. 

Corollary 1. F r any ptimal p licy and f r each c mplete 
ev luti n T, the l wer-my pic and upper-my pic 
p licies pr vide respective l wer and upper b unds n 

YLM

the invent ry levels f the ptimal p licy, i.e., t 
YOPT YUM 

tt with pr bability ne f r each t= 1 T. 

Now consider the improved dual-balancing policy 
denoted by superscript IB. In each period s given the 
observed information set s and the inventory position at 
the beginning of the period we still consider balancing 
the expected marginal holding cost against the expected 
marginal backlogging cost and compute qs as described 
in §4. (That is given the observed information set s and 
the inventory position at the beginning of period s ordering 
qs will balance the expected marginal holding cost and 
the expected marginal forced backlogging costs associated 
with period s.) However in each case where the original 
balancing quantity brings the inventory position below ys 
LM 

IB UMIB

(i.e. x +q <yLM)orabove y (i.e. x +q >yUM) we 

sss s sss 
fix this decision by instead increasing the order up to ys 
LM 
or decreasing it down to ys 
UM respectively. It can be readily 
verified that for each evolution T and each period s we 

LM IB UM

have y y y . 

s ss 

We next prove the following theorem. 

 heorem 3. The impr ved dual-balancing p licy has a 
perf rmance guarantee f tw . 

Observe that in the improved dual-balancing policy it 
is no longer true that in each period t the expected marginal 
holding cost is equal to the expected forced marginal 
backlogging cost. Now let Zt be the maximum among the 
expected marginal holding cost and expected forced 
marginal backlogging cost i.e. Z = max E HIB QIB · 

t ttt 

IBIB IB

E t Qt · t where Qt is the order quantity placed 
by the improved dual-balancing policy in period s. (As 
already mentioned Qt 
IB can be either larger or smaller than 
the balancing quantity Qt.) Similar to Lemma 1 we now 

 


conclude that E IB 2 tE Zt . 
Next we modify the definition of the sets H and 
in §4. The set H will consist of periods t= 1 T− L 
YLM <YIB <YUM YIB YOPT

such that (i) and ;or 

t tt tt 

= YLM = YLM = YUM = YOPT

(ii) YIB <YUM; or (iii) YIB ;

ttt ttt t 

and the improved dual-balancing policy orders more than 

 YLM = YUM

the balancing quantity Q. (That is XIB + Q

t ttt t 
and Qt 
IB Qt.) The set will consist of all the other 
periods in t= 1 T− L. Specifically contains peri<
YIB <YUM >YOPT

ods such that (i) YLM and YIB ;or 

t tt tt 

<YIB = YUM = YLM = YUM = YOPT

(ii) YLM ; or (iii) YIB ;

ttt ttt t 

and the improved dual-balancing policy orders less than 
the balancing quantity Q. (That is XIB + Q >YUM = 

t ttt 

YLM IB

and Q <Q.) Note that for each t∈ H wehave 

t tt 

YIB YOPT YOPT 

 and for each t∈ wehave YIB .

tt tt 

Thus the arguments used to prove Lemmas 2 and 3 are 

still valid. It is then sufficient to show that for each t∈ H 
we have E HIB QIB · = Z and for each t∈ we 

tt t t 

IB IB

have E Q · = Zt. This will imply that the argu


tt t 

ments in the proof of Theorem 2 are still valid and the 
performance guarantee of the policy then follows. 

Assume now that for some t ∈ t such that t∈ H we 
have E HIB qIB · <zt. However this can happen only 

tt t 

if in that period the improved dual-balancing policy orders 

= YUM

below Q and YIB . (The improved dual-balancing

t tt 

IB >YUM

policy orders Q <Q only when XIB + Q and

tt ttt 

= YUM

then it decreases the order until Yt 
IB 
t .) This leads to 
a contradiction because by definition t∈ (see cases (ii) 
and (iii) in the definition of above). 

Similarly assume that for some t ∈ t such that t∈ 

IB IB

we have E t Qt · t <zt. This can happen only if 
in that period the improved dual-balancing policy orders 

IB <YLM = YLM

Q >Q (i.e. XIB + Q ) and YIB . However 

tt ttt tt 
we again get a contradiction because by definition t∈ H 
(see cases (ii) and (iii) in the definition of H above). This 
concludes the proof of the lemma. 

6. C mputati nal Experiments 

As we mentioned in the introduction due to state space 
explosion the corresponding inventory control models are 
very difficult from a computational perspective. Consequently 
we study the typical performance of the balancing 
policies in two settings. In the first setting the optimal solution 
of the capacitated inventory system is easily computed 
but there is no evolution of forecasts (i.e. demands are 
independent over time). This enables us to see how close to 
optimal the balancing policy is in at least one setting. The 
second experiment is more realistic in that the demand and 
forecast evolution processes are governed by the multiplicative 
MMFE model. In fact these are the settings in which 
balancing policies are most attractive because optimal policies 
are inaccessible and no provably good heuristics or 
even reasonable lower bounds are available. As a result we 
benchmark the performance of the balancing policies using 
the myopic and the other semimyopic policies developed 
in this paper in §5. In these experiments the balancing 
policies were very robust. For the model with independent 
demands the dual-balancing policy came within 11% of 
the optimal cost on average within 17% of optimal in 95% 
of the trials and never exceeded the optimal cost by more 
than 25%. Moreover the balancing policies outperform the 
myopic policy by 49% in the first experiment and by 27% 
in the second on average. (In many scenarios the balancing 
policies improve upon myopic by more than 50%.) This 
indicates that the typical performance of the balancing policies 
is significantly better than the worst-case guarantees. 

6.1. Experiments with Translated-Mass 
Exp nential Demand Distributi ns 

In this experiment we consider infinite-horizon problems 
with i.i.d. demand i.e. the distribution of Dt · s 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

Figure 4. Sensitivity of performance to capacity backorder 
costs and demand variance. 

1.00 
1.05 
1.15 
1.10 
3.25 
2.75 
2.25 
1.75 
1.25 
1.20 
1 
32 
1 
16 
1 
8 
1 
4 
1 
2 
1 2 4 816 32 64 
Myopic/opt vs. capacity – 1 Bal/opt vs. backorder$/holding$ 
Bal/opt vs. capacity – 1 Myopic/opt vs. demand variance 
Myopic/opt vs. 
backorder$/holding$ 
Bal/opt vs. demand variance 


is independent of both s and t. We assume that Dt 
has a translated-mass exponential distribution meaning 
that P Dt >x = 1if x<a and otherwise P Dt >x 

− x−a

= qe + where 0 q 1 3> 0 a 0 and 
a· 1 − q = 0. If q= 1 then Dt has an exponential distribution 
translated to the right by aunits. If q<1 then 
a= 0 Dt = 0 with probability 1−q and with probability q 
Dt follows an exponential distribution. For every positive 
mean and variance there is a unique translated-mass exponential 
distribution. 

For infinite-horizon problems with translated-mass exponential 
demand a stationary order-up-to policy is optimal. 
The optimal policy and its cost are easily obtained using 
the following observation: for translated-mass exponential 
demand the lower and upper bounds in Theorem 2 of 
Glasserman (1997) coincide. 

The demand Dt has mean one. We start with a base case 
in which Dt has variance one the capacity is 1.5 and the 
backorder cost per day is eight times larger than the holding 
cost. Figure 4 illustrates what happens when we fix 
two of these parameters and vary the third one. On the 
vertical axis we show the ratio of the cost of the balancing 
policy to the optimal cost and the ratio of the myopic 
cost to the optimal cost. Note that the scale on the vertical 
axis is not uniform. For the solid lines the horizontal axis 
displays the excess capacity (i.e. the capacity minus the 
mean demand or “capacity − 1”). For the dashed lines the 
horizontal ordinate is the ratio of the backorder cost per 
day to the holding cost. For the dotted lines the horizontal 
ordinate is the variance of the demand. 

Figure 5. Histogram of cost as a fraction of optimal 
cost. 

700 
600 
500 
400 
300 
200 
100 
0 

Bal$/opt$ 
Myopic$/opt$ 
12345 

In addition we randomly generated 1 000 problem 
instances using a mean demand of one. The capacity 
the backorder-to-holding-cost ratio and the standard 
deviation of the demand are all randomly generated 
from translated beta distributions. For the capacity the 
distribution has minimum maximum mean and standard 
deviation equal to 1 05 3 3 1 61 0 32 . For the 
backorder-to-holding-cost ratio and the standard deviation 
of the demand the corresponding values are 

11012600 1443 and 01 36 098 051 .Thecomputations 
were done using JAVA on a standard PC and 
computing the balancing decision in each period took 
0.00015 seconds on average. 

Figure 5 shows histograms of the ratio of the balancing 
policy’s cost and the optimal cost and the ratio of the 
myopic policy’s cost and the optimal cost. Figure 6 is a 
restricted view of Figure 5 with a finer grid limited to 
the neighborhood around one. The ratio of the balancing 
policy’s cost to the optimal cost is 1.11 on average with 
a standard deviation of 0.049 a 95th percentile of 1.17 
and a maximum of 1.58. For the myopic the corresponding 
ratio has mean 1.60 standard deviation 0.92 95th percentile 
3.38 and maximum 8.61. This indicates that the 

Figure 6. Detailed histogram of cost as a fraction of 
optimal cost. 

0 
100 
50 
150 
200 
250 
B l$/opt$ 
Myopic$/opt$ 
1.00 1.05 1.10 1.15 1.20 1.25 



balancing policy is very robust compared to the myopic Figure 7. Performance of four policies relative to 
policy. myopic under forecast evolution. 

6.2. Experiments with Multiplicative-MMFE-Based 60 
Demand and F recast Ev luti n 

Lower myopic 
Upper myopic 
Improved balancing 
Balancing 
0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 


This test uses the experimental design of Hurley et al. 
(2006) in which the uncapacitated version of the balancing 
algorithm was tested. In all of our experiments the holding 
and backorder costs are ht = 1 and pt = 10. A horizon of 

length T = 40 was used and forecasts of demand evolve 

according to the multiplicative MMFE model (Graves et al. 

1986 Heath and Jackson 1994). The mean demand per 

Number of scenarios

50 

40 

30 

20 

INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

period averaged over the 40 periods in the time horizon 
is 400 in all cases. The capacity is 460 units per period. 

The experimental design consists of 82 scenarios. For 
each scenario we tested 1 000 random problem instances. 
The scenarios were designed to capture a variety of settings 
and characteristics. Demand and forecast variability 
can be either high or low and lead times can be short or 
long. Some scenarios study different types of seasonality in 
the demand. Others consider product launches and product 
phaseouts. Some scenarios account for the fact that many 
forecasting systems generate accurate forecasts that extend 
many time periods into the future whereas other systems 
can only forecast accurately in the near term. In addition 
shifts in forecasts can demonstrate either no correlation 
positive correlation or negative correlation. The scenarios 
are described in detail in Online Appendix B and in Hurley 
et al. (2006). 

We study five policies: myopic lower myopic upper 
myopic dual-balancing and improved balancing. For each 
of the 82 scenarios constructed and for each policy we 
examine the average per-period cost of the policy over 
1 000 runs. Note that because we consider a complex environment 
and relatively long horizon (T = 40) it is not possible 
to compute the optimal expected cost. Moreover to 
the best of our knowledge it is not even known how to 
compute reasonable lower bounds in this setting. Instead 
we use as our benchmark the myopic policy and the other 
semimyopic policies discussed in §5. The policies were 
computed using MATLAB on a standard PC. The average 
times to compute the period ordering decisions were 
0.0031 0.0738 0.0412 seconds for the myopic the minimizing 
and balancing policies respectively. 

In Figure 7 we provide histograms of the ratio of the 
cost of each policy divided by the cost of myopic. Both 
dual-balancing and improved balancing outperform myopic 
in every one of the 82 scenarios. Relative to myopic they 
provide an average saving of 27.2% and 32.4% respectively. 
Lower myopic is very close to myopic (the ratio is 
usually close to one) and is sometimes worse than myopic. 
The trend is not unexpected because myopic often under-
orders in capacitated systems and lower myopic always 
orders less than myopic. Upper myopic is virtually identical 
to improved dual-balancing which truncates the balancing 

10 

0 

Average per period cost relative to myopic 

order quantities using the order-up-to levels of upper and 
lower myopic. 

In all of our computational experiments the performance 
of the balancing policy is both strong and consistent. 
Improved balancing is better than balancing. 

7. Electr nic C mpani n 

An electronic companion to this paper is available as part 
of the online version that can be found at http://or.journal. 
informs.org/. 

Ackn wledgments 

The authors thank the associate editor and the anonymous 
referees for numerous comments that improved the content 
and exposition of this paper. This research was partially 
conducted while the first and the fourth authors were Ph.D. 
students in the School of Operations Research and Industrial 
Engineering at Cornell University. The research of the 
first author was partially supported by NSFgrants CCR9912422 
CCR-0430682 and DMS-0732175 and AFOSR 
grant FA9550-08-1-0369. The research of the second author 
was partially supported by NSFgrants DMI-0075627 and 
DMI-0500263 and the Querétaro Campus of the Instituto 
Tecnológico y de Estudios Superiores de Monterrey. The 
research of the third author was partially supported by NSF 
grants CCR-9912422 CCR-0430682 DMI-0500263 and 
CCR-0635121. The research of the fourth author was partially 
supported by NSFgrant DMI-0500263. 

References 

Aviv Y. A. Federgruen. 1997. Stochastic inventory models with lim


ited production capacity and periodically varying parameters. Probab. 

Engrg. Infor ational Sci. 11 107–135. 

Axsäter S. 1990. Simple solution procedures for a class of two-echelon 

inventory problems. Oper. Res. 38 64–69. 
Chan E. W. M. 1999. Markov chain models for multi-echelon supply 

chains. Ph.D. thesis School of Operations Research and Industrial 

Engineering Cornell University Ithaca NY. 


INFORM holds copyright to th s art cle and d str buted th s copy as a courtesy to the author(s). 
Add t onal nformat on, nclud ng r ghts and perm ss on pol c es, s ava lable at http://journals. nforms.org/. 

Chen F. J. S. Song. 2001. Optimal policies for multi-echelon inventory 
problems with Markov-modulated demand. Oper. Res. 49(2) 
226–234. 

Dong L. H. L. Lee. 2003. Optimal policies and approximations for a 
serial multiechelon inventory system with time-correlated demand. 
Oper. Res. 51(6) 969–980. 

Federgruen A. P. Zipkin. 1986a. An inventory model with limited production 
capacity and uncertain demands I: The average-cost criterion. 
Math. Oper. Res. 11(2) 193–207. 

Federgruen A. P. Zipkin. 1986b. An inventory model with limited production 
capacity and uncertain demands II: The discounted-cost criterion. 
Math. Oper. Res. 11(2) 208–215. 

Glasserman P. 1997. Bounds and asymptotics for planning critical safety 
stocks. Oper. Res. 45 224–257. 

Graves S. C. H. Meal S. Dasu Y. Qin. 1986. Two-stage production planning 
in a dynamic environment. S. Axsäter C. Schneeweiss E. Silver 
eds. Multi-Stage Production Planning and Control. Lecture Notes in 
Economics and Mathematical Systems. Springer-Verlag New York. 

Heath D. C. P. L. Jackson. 1994. Modeling the evolution of demand 
forecasts with application to safety stock analysis in production distribution 
systems. IIE Trans. 26(3) 17–30. 

Hurley G. P. Jackson R. Levi R. O. Roundy D. B. Shmoys. 2006. New 
policies for stochastic inventory control models—A theoretical and 
computational study. Working paper Cornell University Ithaca NY. 

Ignall E. A. F. Veinott. 1969. Optimality of myopic inventory policies 
for several substitute products. Manage ent Sci. 15 284–304. 

Iida T. P. Zipkin. 2006. Approximate solutions of a dynamic forecast-
inventory model. Manufacturing Service Oper. Manage ent 8(4) 
407–425. 

Janakiraman G. J. A. Muckstadt. 2003. A decomposition approach to a 
class of capacitated serial systems. Technical report TR1360 School 

of Operations Research and Industrial Engineering Cornell University 
Ithaca NY. 

Kapuscinski R. S. Tayur. 1998. A capacitated production-inventory 
model with periodic demand. Oper. Res. 46(6) 899–911. 

Katircioglu K. D. Atkins. 1996. New optimal policies for a unit demand 
inventory problem. Unpublished manuscript. 

Levi R. M. Pál R. O. Roundy D. B. Shmoys. 2007. Approximation 
algorithms for stochastic inventory control models. Math. Oper. Res. 
32 284–302. 

Lu X. J. S. Song A. C. Regan. 2006. Inventory planning with forecast 
updates: Approximate solutions and cost error bounds. Oper. Res. 54 
1079–1097. 

Metters R. 1997. Production planning with stochastic seasonal demand 
and capacitated production. IIE Trans. 29 1017–1029. 

Muharremoglu A. J. N. Tsitsiklis. 2008. A single-unit decomposition 
approach to multi-echelon inventory systems. Oper. Res. Forthcoming. 


Özalp Ö. W. Wei. 2004. Inventory control with limited capacity and 
advance demand information Oper. Res. 52(6) 988–1000. 

Roundy R. O. J. A. Muckstadt. 2000. Heuristic computation of 
periodic-review base stock inventory policies. Manage ent Sci. 46(1) 
104–109. 

Song J. S. P. Zipkin. 1993. Inventory control in a fluctuating demand 
environment. Oper. Res. 41(2) 351–370. 

Tayur S. 1992. Computing the optimal policy for capacitated inventory 
models. Stochastic Models 9 585–598. 

Veinott A. F. 1965. Optimal policy for a multi-product dynamic non-
stationary inventory problem. Manage ent Sci. 12 206–222. 

Zipkin P. H. 2000. Foundations of Inventory Manage ent. McGraw-Hill 
New York.